Science topic

Data Management - Science topic

Explore the latest questions and answers in Data Management, and find Data Management experts.
Questions related to Data Management
  • asked a question related to Data Management
Question
6 answers
  • Machine Learning Integration: AI enhances predictive modelling in biostatistics.
  • Big Data Analytics: Analysing massive datasets improves disease detection.
  • Bayesian Statistics: Prior knowledge refines personalized treatment analysis.
  • Bioinformatics Fusion: Genomic data analysis reveals disease mechanisms.
  • Predictive Modeling: Forecasts guide personalized healthcare and treatments.
  • Cloud Computing: Enables real-time, scalable data sharing solutions.
  • Data Privacy Focus: Compliance ensures secure biomedical data management.
Relevant answer
Answer
Dear Anitha Roshan Sagarkar – thank you for giving me a moment of amusement. Bruce Weaver – I guess your response was the best out of two, which is some kind of distinction…
I do have to tell you, however, Anitha Roshan Sagarkar , that researchgate metrics are no longer influenced by posting nonsense questions. There was an epidemic of this – people using ChatGPT even, when they were clearly not clever enough to think of their own questions, if you can imagine this. As a result of this epidemic of junk questions, RG eliminated their metric that was based on the quality of your community participation. I'm sure you will be disappointed by this, but in the end the forum became overrun by people – often operating as a cartel – who were recommending each other's rubbish questions and answers.
So you can stop now, long story short.
  • asked a question related to Data Management
Question
3 answers
Call for Chapters
A Comprehensive Guide for Novice Researchers in Clinical Trials Elsevier, Academic Press Imprint Series: Next Generation Technology Driven Personalized Medicine and Smart Healthcare For more information on the series, visit Next Generation Technology Driven Personalized Medicine.
Call for Chapters
Introduction to the Theme
The landscape of clinical trials is evolving rapidly, with increasing emphasis on personalized medicine, innovative methodologies, and technology-driven approaches. This book, A Comprehensive Guide for Novice Researchers in Clinical Trials, aims to provide an accessible, in-depth foundation for early-stage researchers and professionals in the field. Topics include research methods, trial design, ethics, data management, and regulatory insights specific to Saudi Arabia. The objective is to create a resource that bridges theoretical foundations with practical applications in clinical trials, addressing the needs of today’s healthcare researchers.
Objectives of the Book
This book is designed to:
  • Equip novice researchers with a comprehensive understanding of clinical trial methodologies and requirements.
  • Introduce essential aspects of clinical research, from trial design to data management, while highlighting ethics and regulatory practices.
  • Serve as a Scopus-indexed reference that leverages Elsevier’s ELSA platform, making it accessible to a broad academic and professional audience.
Table of Indicative Chapters
  1. Introduction to Health Research Methods
  2. History of Clinical Trials
  3. Clinical Trial Designs
  4. Clinical Trial Essentials
  5. Ethics and Good Clinical Practice in Clinical Trials
  6. Trial Protocol Development
  7. Clinical Research Site Operation
  8. Clinical Data Management
  9. Clinical Trial Monitoring
  10. Principles of Statistics in Clinical Trials
  11. Reporting Clinical Trials
  12. Essentials of Project Management
  13. Regulatory Affairs of Clinical Trials in Saudi Arabia
  14. Training Programs and Job Opportunities in the Clinical Trial Industry
Important Guidelines for Contributors
  • Submission Platform: Contributions will be managed through Elsevier’s ELSA platform.
  • Proposal Submission: A chapter proposal (300-500 words) is required for initial review. Detailed guidelines for authors, a sample chapter, and sample chapter abstract are attached for reference.
  • Manuscript Preparation: Use MS Word with consistent formatting (bold, font size) for different heading levels. Each chapter should contain an abstract (100-150 words) and 5-10 keywords. Refer to the Elsevier Manuscript Preparation Guidelines for specific formatting instructions.
  • Artwork and Figures: Figures and tables should be submitted separately, with high-resolution images in JPG or TIFF format as per the provided guidelines.
  • Permissions: Contributors are responsible for obtaining permissions for any third-party material. An artwork list detailing all figures and tables with appropriate permissions is required upon manuscript submission.
  • Language and Style: Both British and US English are acceptable; however, authors must remain consistent within their chapters.
  • Reference Style: Use either the Harvard (Name-Date) or Vancouver (Numbered) style, as outlined in the guidelines.
Timeline
  • Submission of Chapter Proposals (300-500 words): December 5, 2024
  • Acceptance of Book Chapter Proposals: December 10, 2024
  • Full Chapters Due: February 15, 2025
  • Reviews to Authors: March 5, 2025
  • Final Chapters to ELSA: April 1, 2025
  • Publication: Quarter 4, 2025
Editorial and Contact Information
Editors
Managing Editor
For further information, please refer to the attached author guidelines, sample chapters, and sample abstract. We look forward to receiving your proposals and contributions to this impactful project.
Relevant answer
Answer
I have some queries regarding the proposal submission. I have emailed all the editors about this but am yet to get any reply.
  • asked a question related to Data Management
Question
1 answer
I've posted 4 questions in an anonymous form, so you can be completely honest on this often sensitive matter. (It will take you only 60 seconds.)
https://forms.office.com/e/yjLfvGn5x0 This is your chance to be heard! I will post a link to the results afterwards!
Relevant answer
Answer
15 hours a week
  • asked a question related to Data Management
Question
2 answers
I'm currently seeking postdoctoral research opportunities in multidisciplinary areas within Computer Science, with an interest in both academic and industry settings. My research interests include advanced cloud-based data management for smart buildings, NLP for low-resource languages like Amharic, AI and machine learning, data science and big data, human-computer interaction, and robotics. I'm open to discussing potential opportunities and collaborations in these fields. Please feel free to contact me if you are aware of any suitable positions.
Relevant answer
Answer
Dear Dagim Sileshi Dill,
I would recommend the use of Artificial Intelligence in the Internet of Things as a postdoc research area in computer science with multidisciplinary applications.
For this purpose, I would analyze the use of Digital Twinning for the realization of various Intelligent Services.
See my presentation:
Here, Fig. 11 shows the most important areas of application of Digital Twins.
The article "Intelligent IoT - Replicating human cognition in the Internet of Things" can also help you:
Best regards and much success
Anatol Badach
  • asked a question related to Data Management
Question
1 answer
InfoScience Trends offers comprehensive scientific content that delves into various facets of information science research. This includes, but is not limited to, topics such as information retrieval, data management, information behavior, ethics and policy, human-computer interaction, information visualization, information literacy, digital libraries, information technology, information systems, social informatics, data science, and more.
Relevant answer
Answer
yes
  • asked a question related to Data Management
Question
2 answers
How do AI applications differ from smart technologies (medical devices, digital diagnostics, data management systems) already used in medical practice? Can we establish distinct differences between these two kinds of technology, and how do they relate in functionality and use in the medical field?
Relevant answer
Answer
Hi,
While traditional smart technologies in medicine have provided valuable tools for improving healthcare, AI applications bring a new level of intelligence, adaptability, and sophistication, offering capabilities such as learning, personalization, and autonomous decision-making that were not possible before.
  • asked a question related to Data Management
Question
6 answers
Looking for an effective research method - my current method is time consuming.
Relevant answer
Answer
Global perspectives on research management
With global research spending increasing rapidly, and a growing emphasis on international collaborations, professional management of research is essential. University leaders around the world are starting to see the value of building a sustainable research office, along with the research manager role. However, in many respects these roles are still highly adapted to their local research and funding environments.
To understand the growing trends in research management, and how the research manager’s role is changing in light of global challenges, we interviewed senior representatives of Research Management and Administration (RMA) organisations in Africa, Asia, Australasia, Europe, North and South America...
  • asked a question related to Data Management
Question
5 answers
I am currently working on factors that affect the capital structures of firms in Tanzania; one of the factors I am examining is diversification.
Relevant answer
Answer
To calculate the entropy measure of product/business diversification in Stata, compute the entropy index: the sum across segments (or destinations, for trade data) of each segment's sales share multiplied by the natural log of the inverse of that share. (The sum of squared shares, by contrast, gives the Herfindahl index, a related but distinct concentration measure.) You can also calculate geographical sales dispersion by applying the same entropy index to company sales per country.
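The index itself is simple enough to sanity-check outside Stata. A minimal Python sketch, assuming you have per-segment sales figures (the function name and data shape are illustrative):

```python
from math import log

def entropy_index(segment_sales):
    """Entropy measure of diversification: sum over segments of p * ln(1/p),
    where p is a segment's share of total sales. It is 0 for a fully focused
    firm and reaches its maximum, ln(n), when sales are spread evenly over
    n segments."""
    total = sum(segment_sales)
    shares = [s / total for s in segment_sales if s > 0]
    return sum(p * log(1 / p) for p in shares)

# A single-segment firm scores 0; four equal segments score ln(4) ~= 1.386.
focused = entropy_index([100])
diversified = entropy_index([25, 25, 25, 25])
```

Checking a hand computation like this against the Stata output is a quick way to confirm the variable construction is right.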
  • asked a question related to Data Management
Question
1 answer
For implementing a Master Data Management (MDM) System for an Insurance Regulator, what should be the Key Performance Indicators (KPIs) during implementation and during operations?
Relevant answer
Answer
Master Data Management (MDM) is the process of managing and maintaining a single, consistent, and accurate source of truth for critical data elements within an organization, such as customer data, product data, and other key data domains. Implementing an MDM solution can be a complex undertaking that requires careful planning, execution, and monitoring. Key Performance Indicators (KPIs) can be used to assess the success and effectiveness of an MDM implementation. Here are some examples of KPIs that can be used for evaluating an MDM implementation:
  1. Data accuracy: Measure the accuracy of data managed by the MDM solution by tracking the percentage of data records that are accurate and free of errors, duplicates, or inconsistencies. This can be assessed through data profiling, data quality checks, and data validation processes.
  2. Data completeness: Measure the completeness of data managed by the MDM solution by tracking the percentage of data records that are complete and contain all required data elements. This can be assessed through data completeness checks and data validation processes.
  3. Data governance compliance: Measure the adherence to data governance policies and standards as defined by the MDM implementation. This can include tracking the percentage of data records that comply with data governance rules, such as data naming conventions, data classification, and data security requirements.
  4. Data integration and consolidation: Measure the effectiveness of data integration and consolidation efforts within the MDM solution by tracking the number of data sources integrated, the time taken to integrate and consolidate data, and the quality of the resulting master data.
  5. Data quality improvement: Measure the improvement in data quality achieved through the MDM implementation by comparing data quality metrics before and after the implementation. This can include tracking changes in data accuracy, completeness, consistency, and timeliness.
  6. Data retrieval and accessibility: Measure the ease and speed of retrieving data from the MDM solution by tracking data retrieval and query performance metrics, such as response time, query execution time, and user satisfaction with data retrieval capabilities.
  7. Data usage and adoption: Measure the adoption and usage of the MDM solution by tracking the number of users accessing the system, the frequency of data updates, and the level of user satisfaction with the MDM solution. This can also include tracking the number of applications and systems that are integrated with the MDM solution.
  8. Cost savings and efficiency gains: Measure the cost savings and efficiency gains achieved through the MDM implementation by tracking the reduction in data redundancies, data errors, and manual data management efforts. This can also include tracking the time and effort saved in data integration, data validation, and data correction processes.
  9. Business impact: Measure the impact of the MDM implementation on key business outcomes, such as improved customer satisfaction, increased revenue, reduced time-to-market for new products, and enhanced decision-making capabilities. This can be assessed through qualitative and quantitative measures tied to specific business goals and objectives.
It's important to note that the KPIs for an MDM implementation may vary depending on the organization's specific goals, requirements, and industry context. It's essential to define and track relevant KPIs that align with the organization's strategic objectives and desired outcomes from the MDM implementation. Regular monitoring and review of these KPIs can provide insights into the effectiveness and success of the MDM implementation and identify areas for improvement.
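Several of these KPIs reduce to simple ratio computations over the master records. A minimal Python sketch of the completeness and duplicate-rate measures (the records, field names, and business key are illustrative assumptions, not from any real MDM deployment):

```python
# Hypothetical master records for illustration; field names are assumptions.
records = [
    {"id": 1, "name": "Acme Insurance", "license_no": "LIC-001", "country": "SA"},
    {"id": 2, "name": "Beta Mutual",    "license_no": None,      "country": "SA"},
    {"id": 3, "name": "Acme Insurance", "license_no": "LIC-001", "country": "SA"},
]

REQUIRED = ("name", "license_no", "country")

def completeness(records, required=REQUIRED):
    """KPI: share of records with every required field populated."""
    ok = sum(all(r.get(f) not in (None, "") for f in required) for r in records)
    return ok / len(records)

def duplicate_rate(records, key=("name", "license_no")):
    """KPI: share of records that repeat an earlier record's business key."""
    seen, dups = set(), 0
    for r in records:
        k = tuple(r.get(f) for f in key)
        dups += k in seen
        seen.add(k)
    return dups / len(records)
```

In practice such ratios would be tracked per data domain and over time, so that the trend (not just the snapshot) feeds the implementation and operations dashboards.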
  • asked a question related to Data Management
Question
4 answers
Through my research I've developed a data management and analysis software, and am looking for beta testers to help me improve it.
If you are interested you can visit the website at
or send me a message.
Relevant answer
Answer
I'm happy to announce that we are currently rolling out the new and much improved version of the software to our beta testers.
If you would like to give it a try please sign up at our new website www.thot.so, or send me an email at brian@thot.so
(Jan Vališ sorry for the extraordinarily late reply, I haven't checked this thread for quite a while.)
  • asked a question related to Data Management
Question
5 answers
climate action, cultural heritage, digital data management survey
Relevant answer
Answer
Please contact Tactis.fr
  • asked a question related to Data Management
Question
4 answers
What benefits? In which research center and university?
In the areas of blockchain, data management, and data security for AI or DL, which centers are most qualified or have investigations in these areas?
Relevant answer
Answer
Dear António Brandão,
As far as I know, postdoctoral work is a self-guided research activity. One does it to extend one's PhD work.
  • asked a question related to Data Management
Question
3 answers
I have six kinds of compounds which I then tested for antioxidant activity using the DDPH assay and also anticancer activity on five types of cell lines, so I got two types of data groups:
1. Antioxidant activity data
2. Anticancer activity (5 types of cancer cell line)
Each data consisted of 3 replications. Which correlation test is the most appropriate to determine whether there is a relationship between the two activities?
Relevant answer
Answer
Logistic regression is what I had in mind. The DV might be anticancer activity (yes/no), and the same for antioxidant activity. Best wishes, David Booth
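If instead you want to keep both activities continuous, a rank-based (Spearman) correlation across the six compounds is a common non-parametric choice for such small samples. A minimal pure-Python sketch, assuming you first average the three replicates to get one value per compound:

```python
def ranks(values):
    """Average ranks (1-based), with ties sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(values):
        j = i
        while j + 1 < len(values) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1          # mean of positions i..j, 1-based
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)
```

With five cell lines you would compute one rho per line (antioxidant means vs. anticancer means over the six compounds), keeping in mind that n = 6 gives very low power.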
  • asked a question related to Data Management
Question
5 answers
This is a bit of a meta-question. I've developed a piece of software to help researchers with data management and analysis. The software is at a point now where I am looking for more beta testers.
Is there a good thread on Research Gate or other forums where it is appropriate to announce the software?
Relevant answer
Of course, you can use a thread or the Q&A feature on ResearchGate to announce the software and call for beta testers. That said, I don't believe ResearchGate is the best venue for this; other social-academic and software forums are available, including GitHub discussion groups, LinkedIn, and Stack Exchange. Even this thread would work.
  • asked a question related to Data Management
Question
4 answers
Can anyone help me with the Trial Data Management process?
Relevant answer
Answer
All SAS-related poster/presentation may be found at https://www.lexjansen.com/ Use their search engine for anything related to SAS.
  • asked a question related to Data Management
Question
10 answers
I have one independent variable with two groups: physically and chemically activated rice husk.
And I have three parameters (dependent variables) to test its effectiveness in water purification. What type of data management, processing, and analysis should I use for my study?
The title of my study is: "POTENCY OF ACTIVATED CARBON FROM RICE HUSK AS A COMPONENT FOR WATER PURIFICATION"
Relevant answer
Answer
I would caution against MANOVA because it's not what you think it is.
A lot of people assume that a MANOVA works like an ANOVA but with several outcome variables at once. This is not at all true. MANOVA mashes your outcome variables together into composite outcome variables, and it does so in a way that you cannot control. Instead of giving each outcome variable equal weighting, or even allowing you to decide on the relative importance of each variable, it bases the weighting on the relationship of each variable to the predictor variables. As Kendal Smith and colleagues put it, ANOVA and MANOVA do not analyze the same variables and thereby address different research questions.
See
Huang FL. MANOVA: A procedure whose time has passed? Gifted Child Quarterly. 2020;64(1):56-60.
Smith KN, Lamb KN, Henson RK. Making meaning out of MANOVA: The need for multivariate post hoc testing in gifted education research. Gifted Child Quarterly. 2020;64(1):41-55.
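The alternative these papers point toward is a separate univariate ANOVA per outcome, with a multiplicity correction. A minimal pure-Python sketch of the one-way F statistic (p-values would then come from the F distribution with the stated degrees of freedom; the data below are made up):

```python
def one_way_f(groups):
    """F statistic for a one-way ANOVA over a list of groups (lists of values)."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n
    # Between-group and within-group sums of squares
    ss_between = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
    ss_within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups)
    df_b, df_w = k - 1, n - k
    return (ss_between / df_b) / (ss_within / df_w)

# One F test per outcome parameter; with three outcomes, compare each
# p-value against alpha/3 (Bonferroni) to control the familywise error rate.
physical = [1.0, 2.0, 3.0]   # hypothetical replicate values, one parameter
chemical = [4.0, 5.0, 6.0]
f_stat = one_way_f([physical, chemical])
```

With only two groups per outcome, each F test is equivalent to a squared t test, so ordinary t tests per parameter (Bonferroni-corrected) would answer the same question.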
  • asked a question related to Data Management
Question
5 answers
Hello,
I am starting a project and a white paper in this area. I'm curious about how new developments in Big Data, and the ever-growing number of platforms built around this concept of data management, are affecting current decision-making processes and techniques inside businesses, corporations, and non-profit organizations. Are we seeing only the first, novel effects, or are we already in the middle of the storm?
I'd like to hear from anyone who can shed some light on this topic.
Cordialement,
oa
Relevant answer
Answer
Decision making in companies and organizational leadership should depend on the key variables that determine the effect of the decision to be made. The volume of data (big data) may help, but it is not decisive for effective decision making. Top management should identify the point that requires a decision (the purpose), the key factors related to that purpose, and the resources required to make the decision on that particular issue. Relevant data on each of these variables is more decisive than the sheer bulk of data. Nonetheless, big data is important for reviewing the overall performance of the company and the effectiveness of its leadership. For specific decisions, specific but relevant data is required, so the extent to which big data affects the decision-making process depends on the scope of the decision to be made.
  • asked a question related to Data Management
Question
4 answers
I am collecting questionnaire data in an epidemiological study from large numbers of participants (>1000). What is the best data management system/software for: data entry, data validation and data checking?
Relevant answer
Answer
Epi Info is free and has good backup support.
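Whichever package you choose, the data validation step usually reduces to a set of per-field rules applied to every record. A minimal Python sketch of automated data checking (field names and valid ranges are assumptions, not from any particular questionnaire):

```python
# One rule per field; a rule returns True when the value is acceptable.
RULES = {
    "age":    lambda v: v is not None and 0 <= v <= 120,   # required, plausible range
    "sex":    lambda v: v in ("M", "F"),                   # required, coded values
    "weight": lambda v: v is None or 20 <= v <= 300,       # optional, kg range
}

def validate(record):
    """Return the names of fields that fail their rule for this record."""
    return [field for field, rule in RULES.items() if not rule(record.get(field))]

def check_batch(records):
    """Map record index -> failing fields, for all records with problems."""
    return {i: bad for i, r in enumerate(records) if (bad := validate(r))}
```

Tools like Epi Info and REDCap let you declare equivalent rules in the data-entry form itself, which catches errors at entry time rather than at cleaning time.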
  • asked a question related to Data Management
Question
5 answers
Who also uses eLabFTW for documentation of experiments and analysis? In what research area and what experience do you have?
Relevant answer
Answer
Hello Frank Krüger, thank you for your feedback, we recently started testing the eLabFTW in the workflow at our Electron Microscope. It would be cool if we could compare/exchange our methodology. It would be useful to consider a user meeting on this topic.
  • asked a question related to Data Management
Question
3 answers
In the form of an essay provide a critical analysis of data management within an organisation of your choice, showing the importance of effective and efficient data management systems and processes.
  • Evaluate the decision-making processes used in the business environment and how these are linked to data management.
Finally link the evaluation of two communications theories into organisational decision making and data management procedures used showing how these rely on effective management and leadership practice.
Relevant answer
Answer
Oops sorry but I confused different Dr..Sharma's. The one I know has published in Decision Theory and Management Science quite heavily, but this one has a great deal of management experience. Mea culpa and apologies to all for my error. David Booth
  • asked a question related to Data Management
Question
2 answers
For example; Data should be collected and distributed in a single center, or there should be a data management portal for each province as local governments or a separate center for transportation.
What are your thoughts on this matter?
Relevant answer
Answer
I agree with Cenk. Directly sending emails to authorities may be a good choice based on my own experiences.
  • asked a question related to Data Management
Question
9 answers
Big Data is another trend in the world of ICT with implications for information management. In this regard, can libraries manage Big Data in terms of its volume, velocity, and variety? The amount of data in the world today has been exploding. How could libraries help in managing Big Data?
Relevant answer
Answer
The use of Big Data Analytics databases and analytical technologies can significantly improve the information offering of digitized online libraries. Analytics applied to digitized library resources can also greatly expand the possibilities for research in book science, and the results of such studies, posted on the library website, further enrich that offering.
Regards,
Dariusz Prokopowicz
  • asked a question related to Data Management
Question
8 answers
Animal breeding is all about big data management and handling, which is not possible without the application of computers. Since data analysis and interpretation play a pivotal role in animal breeding, it is important for breeders to gain insight into the software used for data handling and analysis.
Relevant answer
Answer
Thanks for your contribution and support @ Dr. Mohammad Ghaderzadeh
  • asked a question related to Data Management
Question
3 answers
Hi,
I am looking for an open-access tool for Master Data Management. Using Talend Open Source for MDM, I am unable to create master data. Is there any other open-access tool that could help me with this?
  • asked a question related to Data Management
Question
1 answer
  • What are the risks of an Industry 4.0 solution?
  • How is data managed at this moment in time?
  • Can the end user manage the velocity, variety, volume and veracity of its current information flow?
  • What would happen if this was tripled or quadrupled?
  • How does it support integration across the value chain?
Relevant answer
Answer
One risk is connected with security issues.
But organizing a data storage using the I4.0 principles gives more flexibility and the security part is a price to pay.
We've been developing an approach to designing a data management system for measuring forecasting accuracy across many time series and across multiple horizons and forecasting origins. By following the I4.0 principles, the following framework has been developed:
The framework presented implements the so-called Forecast-value-added (FVA) analysis (see slide 6) while following the principles of interoperability, decentralization, real-time processing, service-orientation, and modularity.
Our general problem definition is given on slides 4-5. The I4.0 principles helped obtain a cross-platform solution for forecasting, which is simple to implement and learn, and fast in operation (applicable in production settings).
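In its simplest form, FVA analysis compares a model's error against a naive benchmark such as the last observed value carried forward. A minimal Python sketch (the MAPE error measure and the toy numbers are illustrative assumptions, not the framework from the slides):

```python
def mape(actual, forecast):
    """Mean absolute percentage error between actuals and forecasts."""
    return sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast)) / len(actual)

def fva(actual, model_forecast, naive_forecast):
    """Forecast value added: the accuracy gained over the naive benchmark.
    Positive FVA means the modelling step actually improved the forecast."""
    return mape(actual, naive_forecast) - mape(actual, model_forecast)

# Naive benchmark: carry the last observed value forward.
history = [100, 110, 105, 120]
actual = [125, 130]
naive = [history[-1]] * len(actual)   # [120, 120]
model = [124, 131]                    # hypothetical model output
value_added = fva(actual, model, naive)
```

Run per series and per horizon, this comparison identifies where the modelling (or judgmental override) step adds value and where the naive forecast would do just as well.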
  • asked a question related to Data Management
Question
4 answers
Dear respected colleagues,
I would like to gauge how worthwhile it would be to spend some time implementing and testing a well-tested tool for simulating Blockchain-assisted, Fog-enhanced systems. Initially, I suggest the tool would allow users to choose the layer of the Fog Computing architecture at which they would like to place the Blockchain. Further, the tool would allow choosing one of several available consensus algorithms to simulate different cases for each scenario. The services that can be provided by the Blockchain in this tool include, but are not limited to, computational services (smart contracts), data management, identity management, and payment services between system entities.
If such a simulation tool is available, how likely it is that you would use it in your research?
Relevant answer
Answer
Thank you very much for your responses.
Following is a preprint of the proposed Fog-Blockchain Simulation Tool:
  • asked a question related to Data Management
Question
8 answers
>500 in the sample
Relevant answer
Answer
Why don't you try MATLAB?
  • asked a question related to Data Management
Question
4 answers
What are Weak supervision systems?
Please share your views, opinions, suggestions, recommendations and case studies. Thank you.
Relevant answer
Answer
Supervision depends on statistical knowledge. The data and the working strategy of the algorithm make the key difference, indeed.
  • asked a question related to Data Management
Question
9 answers
I am writing a paper assessing the unidimensionality of multiple-choice mathematics test items. The scoring of the test was based on right or wrong answers, which implies that the data are on a nominal (dichotomous) scale. Some earlier research studies I have consulted used exploratory factor analysis, but with my limited experience in data management, I think ordinary factor analysis may not work here. Unidimensionality is one of the assumptions for dichotomously scored items in IRT. I would appreciate professional guidance, if possible with the software and the manual.
  • asked a question related to Data Management
Question
6 answers
I am currently using NVivo for my constructivist grounded theory analysis. I have large volumes of data and I am using a variety of different sources. I think it works really well as a data management tool, however, it is crashing at least once every two hours and the recovery isn't always the latest version. I would be interested to know if other researchers are experiencing the same issues? If not, I would be keen to know the spec of the PC/laptop they are using. I am currently using a Dell Inspiron 15 5000.
Relevant answer
Answer
Most of the links that Mary provided are to rather general discussions of NVivo, rather than to technical problems like you are experiencing. My strong impression is that NVivo is quite solid, so I would try to connect with the support team at QSR to get a resolution to your problem.
  • asked a question related to Data Management
Question
5 answers
In the Mental Health Department of the Navarre Public Health Service a Group Therapy Unit has been recently created (August 2018). We are gathering information about management, cost-effectiveness and clinical efficacy, but we would like to know if there are other similar experiences around the world.
We do know that group therapy is a common service provided by mental health departments, but always within wider inpatient or outpatient units that also provide other treatments.
Our inquiry is if there are other specific Group Therapy Units that specialize in providing only group format interventions.
We are interested in sharing data, management indicators identified, clinical and process variables assessed etc.
Relevant answer
Answer
It will be hard to conduct group therapy virtually!
  • asked a question related to Data Management
Question
3 answers
In our organization, researchers work with simulations. These simulations have code, models, and the data they produce. Each of these has different versions producing different outputs. All of this data is currently saved on network drives in a very disorganized fashion.
Are there any known tools/software used to handle such a scenario, where these data can be easily linked to each other? The simulation-data-code link should be clear across the different versions.
Relevant answer
Answer
No. You could try to standardize the simulation software, but there are advantages and disadvantages to each package. Your researchers are currently trying to pick the best software for their problems; if you want to maximize good research, you let the researchers decide. Other than that, what is your goal?
As a retired professor I can assure you that allowing only one statistical package on campus has never worked, because no single package does everything. The same is true for simulation. The reason I use R now is that my university mandated SPSS, which was of no help in my research. So I moved to R, and I'm glad my university did that, because I never would have discovered the power of R without the SPSS mandate. Other departments required everyone to purchase their department's favorite software. R is not only the most powerful, it is free and open source.
The question you are really asking is: which set of problems do you want to live with? With R I have neither, since I can ignore the mandate because of R's power and cost. I recommend you don't standardize (mandate) a single package for everyone; if you do, your simulation situation will end up like our statistics one. May the force be with you. David Booth
  • asked a question related to Data Management
Question
3 answers
Will a NoSQL database improve data management and optimize queries in Big Data analysis at financial institutions? Any help will be appreciated.
Relevant answer
Answer
Possibly. NoSQL databases like MongoDB provide object-based programming/scripting facilities for data storage and retrieval. It is the task of the developer to write scripts to optimize storage and retrieval, and some inbuilt functions are available to help with such optimization tasks, I think.
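The storage-and-retrieval optimization point can be illustrated independently of any database engine: an index is essentially a precomputed lookup keyed on the field you query. A toy Python sketch (field names are illustrative; in MongoDB the equivalent is creating an index on the query field):

```python
# A "collection" of documents, as plain dicts.
documents = [{"account_id": i, "balance": i * 10} for i in range(100_000)]

def scan(docs, account_id):
    """Unindexed query: a full collection scan, O(n) per lookup."""
    return [d for d in docs if d["account_id"] == account_id]

# Building the index is O(n) once; every point lookup afterwards is O(1).
# This is, conceptually, what an index on account_id buys you.
index = {d["account_id"]: d for d in documents}

result_via_scan = scan(documents, 42)[0]
result_via_index = index[42]
```

The trade-off is the same as in a real database: the index costs memory and must be updated on every write, in exchange for fast reads on that key.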
  • asked a question related to Data Management
Question
5 answers
I would love to see examples of your stats sheet/screen shot of your data collector.
I am also interested in what the data gets used for.
Thanks,
Rachel
Relevant answer
Answer
on paper
  • asked a question related to Data Management
Question
9 answers
Do Data Mining and Big Data fall under the subject of Artificial Intelligence, or are these terms also discussed in the context of Data Literacy or Data Management within Library and Information Science?
  1. Do librarians' data literacy skills remain the same as data scientists' skills? If data scientists' skills are higher than librarians' data literacy skills, will librarians be replaced in the future job market?
  2. What should librarians do to enhance their data literacy skills?
Is there any study (dissertation, model, conference paper, poster) that discusses data literacy in the context of AI (Big Data and Data Mining) applications in libraries?
_Yousuf
Relevant answer
Answer
Hi Mohammad,
Big data/data mining, business intelligence/analytics, data science, distant reading, knowledge discovery, etc. are different terms used in different disciplines to denote essentially the same thing: statistical analysis and discovery of novel patterns in data, presented in a form conducive to human consumption. Meliha
  • asked a question related to Data Management
Question
3 answers
In my opinion, the chemical industry is one of the most important industries in the world. Not only do 90% of our everyday products contain chemicals, but the industry also employs approximately 10 million people. Naturally, it was one of the first to embrace digital technologies such as process control systems and sensors, which have a long tradition in production.
A continuous digital transformation plays a crucial role in several key aspects of the industry. Accenture has identified the six most influenced areas.
  • Higher Levels of Efficiency and Productivity
Increases competitive advantage and further decreases costs through operational optimization.
  • Innovation through Digitalization
Helps boost productivity in R&D and thus decreases the time to market entry.
  • Data Management and Analytics
An improved understanding of customer needs, and thus the optimization of offerings, is an integral contributor to any company's success.
  • Impact on the Workforce
Tasks and opportunities, as well as the job requirements, will experience a change with digital transformation. “Finally, technology will take an even greater role in upskilling and training employees, and in knowledge management.” (World Economic Forum & Accenture, 2017)
  • Digitally enhanced offerings
An increasingly important aspect of product performance, especially for close-to-end-customer markets.
  • Digital Ecosystems
Separated into the Innovation, Supply and Delivery, and Offering Ecosystems.
Industry 4.0, however, does not only include aspects of digitalization but also artificial intelligence, robotics, the Internet of Things (IoT) and advanced materials. In the table below, their impact on chemical products is illustrated.
Please elaborate on your thoughts about this as well.
Relevant answer
Answer
Thanks, clear!
  • asked a question related to Data Management
Question
5 answers
I have a dataset in CSV format with over 22k rows and I need to transpose it. An Excel sheet is limited to 16,384 columns, so it doesn't let me transpose. I spent much time trying to transpose it before realising I couldn't, due to Excel's size limit.
I have now transposed the data in Matlab, but I can't export it since an Excel sheet can't handle it. Working with Excel is much more convenient for me for data management and storage for future reference, although the rest of my work is in Matlab.
Can anyone suggest an alternative way to store the transposed data as a CSV file on my cloud?
Thanks.
Relevant answer
Answer
Thank you everyone. All the answers were really helpful to cross my hurdle.
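For future readers: the transposition and export can be done without Excel at all, e.g. in Python with pandas (a sketch with stand-in data; the real 22k-row file would simply be read with pd.read_csv("yourfile.csv")):

```python
import io
import pandas as pd

# Small stand-in for the real 22k-row CSV
csv_text = "id,v1,v2\nA,1,4\nB,2,5\nC,3,6\n"
df = pd.read_csv(io.StringIO(csv_text), index_col="id")

# Transpose: 22k rows become 22k columns; neither pandas nor the
# CSV format itself has Excel's 16,384-column limit
df_t = df.T

# Write the transposed data back out as CSV (here to a buffer;
# with a path this goes straight to disk or a synced cloud folder)
buf = io.StringIO()
df_t.to_csv(buf)
```

The resulting CSV can be stored on any cloud drive and read back later by Matlab or pandas; only opening it *in Excel* would hit the column limit.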
  • asked a question related to Data Management
Question
3 answers
I'm working on a real-time data management system for social media data. The goal of this work is to improve the decision-making process in e-government policy. This requires knowing the best database for this kind of system: MongoDB or another? Please advise.
Relevant answer
Answer
Cloudera
  • asked a question related to Data Management
Question
4 answers
I coordinate field operations for a farm with more than 5000 acres of blueberries and cranberries, divided into a number of fields ranging from 10 to 125 acres each. I am hoping to bring in a new data management system for a better grip on the money in and money out of each field.
By data, I mean chemical inputs, equipment usage, labour charges, and yields, to have a better grip on every part of the farm. This will help us plan variable applications and management.
As of now, I am looking for a GIS solution/software for data management, visualisation, analysis (to support third-party statistical software), and decision-making.
Relevant answer
Answer
QGIS
  • asked a question related to Data Management
Question
4 answers
Dear colleagues, please help out with this challenge.
I downloaded Landsat 8 imagery for my study location from the USGS Earth Explorer. I extracted the zip folder and added it to ArcGIS using
Data Management > Raster > Raster Processor > Composite Bands
I input the bands in order 754, added an output location, and ran the analysis.
Alas, it doesn't perform the analysis: a white background appears in place of the composite image. I repeated this for bands 431 but got the same result.
I need suggestions on how to get a composite of the bands for land use analysis.
Many thanks
Relevant answer
Answer
Dear Frank,
try to open the Landsat images in ArcMap via "Add Data" --> browse to the unzipped folder --> choose the "..._MTL.txt" that has a raster symbol.
ArcMap will now recognize the different bands and stack them. You can then export the data and save them as a composite.
If you downloaded uncalibrated data, you can additionally use ArcMap to perform the Top of Atmosphere (TOA) correction. To do so, go to "Windows" --> "Image Analysis" --> select the data, e.g. "Multispectral_LC8..." --> "Processing: Add Function" --> right-click "Stretch Function" --> "Insert Function" --> "Apparent Reflectance Function" --> check "Albedo" --> in the "General" tab choose "32 Bit Float". Now save and export the new raster, named for example "Func_Multispectral_LC8..."
Some links that might help:
  • asked a question related to Data Management
Question
3 answers
Is there any new data on the management of mild cognitive impairment?
Relevant answer
Answer
" Technology-based cognitive training and rehabilitation interventions show promise, but the findings were inconsistent due to the variations in study design. " [Ge & Zhu & Wu 2018]
  • asked a question related to Data Management
Question
15 answers
We have 40 probes out in rivers and streams collecting conductance, water level, time, and temperature. These will take measurements every 5 minutes for 2 years. LOTS OF DATA! Any recommendations for data management programs would be good. Our data analyst wants to use Excel... I don't think that will cut it.
Thanks
Relevant answer
Answer
Meanwhile, water quality monitoring has been evolving to the latest wireless sensor network (WSN) based solutions in recent decades. This paper presents a multi-parameter water quality monitoring system of Bristol Floating Harbour which has successfully demonstrated the feasibility of collecting real-time high-frequency water quality data and displayed the real-time data online. The smart city infrastructure – Bristol Is Open was utilised to provide a plug & play platform for the monitoring system. This new system demonstrates how a future smart city can build the environment monitoring system benefited by the wireless network covering the urban area. The system can be further integrated in the urban water management system to achieve improved efficiency.
  • asked a question related to Data Management
Question
3 answers
I have 8 tiles of ASTER DEM with an overall elevation range of 212 m to 1766 m across the tiles. After mosaicking all these tiles (Mosaic, a Data Management tool in ArcGIS 10.3), the elevation range gets suppressed to 228 m to 1108 m. Why is this happening? Is there another relevant method?
Relevant answer
Answer
If the datasets are big, this can happen. What type of RS images are you using: MODIS, Landsat or Sentinel?
  • asked a question related to Data Management
Question
2 answers
My objective is to find the relevance of using big data in the science and technology sector of Ethiopia and to know how to implement it. As the science and technology sector reaches many industries, I want my research to give a basic guideline for utilizing big data technology in this area.
Relevant answer
Answer
Thank you, Professor Ette Etuk. As Ethiopia is somewhat new to the technology, I wanted to generalize the management framework of big data, not just the analytics. I understand analytics is an important part of big data, but I wanted to start from the beginning. Is there such a method that I can recommend as a policy guideline?
  • asked a question related to Data Management
Question
5 answers
Hello amazing colleagues,
I'm an MBA student with a huge passion for innovation, especially machine learning and AI, and I have experience in the construction field in the Middle East in system support and data management. So I picked this subject for my dissertation this October. I hope to produce distinguished work, which could help me secure a scholarship for a PhD in the same subject.
So I'm seeking your help and advice on where I can link these two topics, what the new research areas between ML/AI and construction are, and what the new trends are, so I can work on them.
I know your time is really valuable, so I really appreciate your help and advice.
Relevant answer
Answer
Basically, anything you want to predict or estimate can be used in the construction industry. Some particular applications: fault detection (before/during/after construction); recommendations on architecture, colour, etc.; price/construction-time/resource prediction; deciding the best time to start construction; 3D reconstruction from 2D architectural drawings; and many more. Many of these applications are related to computer vision.
It seems some recommendation systems are most related to your background.
Disclaimer: I am not an expert in this topic but maintain a general interest in AI's applications.
  • asked a question related to Data Management
Question
5 answers
Recently, I wanted to add a variable to a data table using the mutate function in R. Unfortunately, I ended up completely losing the columns of the original data set in the process. Is it possible to undo manipulations in R? How?
Relevant answer
Answer
Usually, unless you're performing time-consuming operations like multiple imputation of missing data, loading the original data and doing all data transformations "on the fly" (to repeat all manipulations) is very quick, especially with packages like dplyr, tidyr or data.table.
If you really want to save interim steps, simply save the next steps in another object (data.frame), and finally save your complete workspace. There's a short, helpful SO-thread on this topic: https://stackoverflow.com/questions/34539011/saving-in-r-studio
I really recommend 1) using projects; 2) importing the raw data; 3) doing all data wrangling (manipulations) by writing a clean, commented script. Optionally, 4) save your workspace. Unlike other statistical software packages, R has the advantage that you're less often in the situation to save all your changes, as beginning from scratch by running your script is very fast.
  • asked a question related to Data Management
Question
5 answers
I have data on pressure injuries. Each row tells me how many pressure injuries each person has. I also have variables such as toe, foot, and hand, which specify the stage of the pressure injury at that location. I want to break these variables apart, but I am missing something. The current data are as follows:
ID No of PI Foot1 Arm Hand
1 1 Stage1
2 3 Stage 3 Stage 3 Stage 1
3 2 Stage 4 Stage 1
I want the data to look like:
ID No of PI PI 1 Location PI 1 Stage PI 2 Location PI 2 Stage
1 1 Foot 1 Stage1 Arm Stage 1
2 3 Foot 1 Stage 4 Hand Stage 2
3 2 Foot 1
I have tried using the recode and compute functions, and it works for the first pressure injury, but it doesn't seem to work for the second and onwards.
Can you please offer suggestions? Thanks in advance.
Relevant answer
Answer
Please refer to:
1. Landau, S. and Everitt, B. S. (2004). A Handbook of Statistical Analyses Using SPSS.
2. Howitt, D. and Cramer, D. (2008). Introduction to SPSS.
Regards,
Zuhair
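Beyond the book references: the restructuring itself (one column per body site into numbered location/stage pairs) can also be sketched outside SPSS, e.g. in Python/pandas. The column names and values below are hypothetical stand-ins for the real variables:

```python
import pandas as pd

# Hypothetical stand-in: one column per body site, holding the
# stage of the injury there (None if no injury at that site)
df = pd.DataFrame({
    "ID": [1, 2, 3],
    "Foot1": ["Stage 1", "Stage 3", "Stage 4"],
    "Arm":   [None, "Stage 3", "Stage 1"],
    "Hand":  [None, "Stage 1", None],
})

# Wide -> long: one row per (ID, location, stage), dropping empty sites
long = (df.melt(id_vars="ID", var_name="Location", value_name="Stage")
          .dropna()
          .sort_values("ID", kind="stable"))

# Number each person's injuries 1, 2, 3, ...
long["PI"] = long.groupby("ID").cumcount() + 1

# Long -> wide again: paired "PI n Location" / "PI n Stage" columns
wide = long.pivot(index="ID", columns="PI", values=["Location", "Stage"])
```

SPSS's own Restructure wizard (VARSTOCASES then CASESTOVARS) follows the same wide-to-long-to-wide logic.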
  • asked a question related to Data Management
Question
4 answers
I graduated from business school with an MBA degree years ago. It has been more than 5 years since I worked as an IT marketing specialist, but I think I may be able to learn more skills in data management/data mining/analysis so as to work as a data management specialist and do more research on data mining/management/analysis in health care. I have basic knowledge of statistics, databases and research methods, and am working on Certified Health Informatician Australasia (CHIA) at this moment.
Please give me any clue that may help me out of the disorientation.
Relevant answer
Answer
Hi Allen,
With your credentials and background, you have plenty of opportunities. Having been in both industry and academia myself, I think your background is commensurate with both paths. If you consider industry, then you might look at corporate research and development (R&D) or some similar occupation where research and statistics are necessary, such as market analysis. Outside academia, you might also try to align yourself with a research-based think tank and pursue grant funding.
Good luck!
--Adrian
  • asked a question related to Data Management
Question
4 answers
Currently, I am encountering a problem with managing the data to be analyzed. I have limited resources; however, I have been advised to use data triangulation for data normalization and management.
Relevant answer
Answer
  • asked a question related to Data Management
Question
5 answers
I have been thinking that there are problems with data management within hospitals and data sharing across them; some hospitals have different systems, so which factors affect the ability of systems to share data?
Relevant answer
Answer
Do you want to study how hospitals in Tanzania are using data management systems within the hospital and how they share data, or do you want to study how to introduce this in hospitals in Tanzania?
  • asked a question related to Data Management
Question
4 answers
Cloud technology features computing and storage capacity and high reliability. The question is: how can we refrigeration and air-conditioning (RAC) professionals use cloud computing and cloud storage effectively and efficiently for our parts sizing, design calculations, test data processing and storage, and drawing data management?
Relevant answer
Answer
Thanks, Himadri! It's good information from China.
  • asked a question related to Data Management
Question
5 answers
I have 4 populations of different races that I am following for a period of time. I am looking for the incidence of a particular condition (let's call it condition A) after an event x (let's say joining a particular business). I find that one particular population (let's call it population k) has a significantly lower incidence of condition A compared to the other populations. Upon further analysis I realize that event x occurred in most members of population k a lot later than in the other three populations (i.e. most members joined the business a lot later than the other 3 groups).
I want to know whether the lower incidence of condition A is due to the late occurrence of event x, or whether they actually have a lower incidence.
How do I approach this problem? I am thinking of getting Kaplan-Meier curves of condition A in all 4 subgroups; what do I do thereafter?
Relevant answer
Answer
nice information thank you
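Since the question asks what to do with the Kaplan-Meier curves: measuring time from event x (rather than calendar time) automatically adjusts for the later joining of population k, and the groups can then be compared with a log-rank test. As a self-contained illustration of the estimator itself (in practice one would use R's survival package or Python's lifelines), here is a minimal sketch of the standard Kaplan-Meier formula:

```python
from collections import Counter

def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.
    times:  time from event x until condition A or censoring
    events: 1 if condition A occurred, 0 if censored
    Returns a list of (time, survival probability) steps."""
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    s, curve = 1.0, []
    for t in sorted(deaths):
        at_risk = sum(1 for ti in times if ti >= t)
        s *= 1 - deaths[t] / at_risk   # S(t) = prod(1 - d_i / n_i)
        curve.append((t, s))
    return curve

# Tiny illustrative group: condition A at t=2 and t=5, censored at t=7
curve = kaplan_meier([2, 5, 7], [1, 1, 0])
# two steps: S(2) ~ 0.667, S(5) ~ 0.333
```

Fitting one curve per population and comparing them (log-rank, or a Cox model with population as a covariate) answers whether population k's incidence is genuinely lower once follow-up time is accounted for.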
  • asked a question related to Data Management
Question
17 answers
I am facing a problem with GSEM in Stata: when I add my variables and run it, it takes a long time and still does not converge (all my variables are categorical). I have therefore used SEM with these variables (16 latent and 4 observed), which are mostly ordinal with a few binomial. The total number of observations is around 400. I am using Stata version 13.1. Can I use SEM to analyse these variables? I am modelling them in the SEM Builder instead of coding (model attached for reference). Can you also suggest the best method to create the reduced model in the SEM output? I have multiplied the path coefficients of the variables and done it manually.
Relevant answer
Answer
nice information thank you
  • asked a question related to Data Management
Question
3 answers
Can you please recommend any good open-source software for geotechnical borehole logs, field and lab testing, and geotechnical modeling?
Relevant answer
Answer
You can check the link below:
  • asked a question related to Data Management
Question
5 answers
I need this book; can anyone help me? "Data Provenance and Data Management in eScience", edited by Qing Liu, Quan Bai, Stephen Giugni, Darrell Williamson and John Taylor.
Relevant answer
Answer
Ahmed, this question has been answered; I am not sure why you make your statement. The book can be either borrowed or purchased.
  • asked a question related to Data Management
Question
7 answers
I have 38 semi-structured interviews, and this will be my first time using a software program for my coding. Usually I use a whiteboard and post-its, but with the richness of these data it is getting unwieldy. I would like a more efficient way to visualize things across participants. Thanks!
Relevant answer
Answer
Use Qiqqa, it has autotags as well as multiple tags for you to do your own coding. It will automatically suggest themes! It also does Brainstorms.
  • asked a question related to Data Management
Question
14 answers
We had estimated 600 samples for one study with national representation. In some districts, for the purposes of a pilot intervention, we collected an additional 196 samples with the same tools and methods. Would it be appropriate to include all the samples in the national estimation?
Relevant answer
Answer
The concern is that the pilot samples are (in some way) sampling a different population than the full study. Maybe the pilot survey was done just prior to a very successful national ad campaign. The campaign changed the proportion of people that use that type of product. Say the survey was on those people who drink orange juice. The pilot survey was sampling a population where 80% of the people had at least one glass of orange juice per day. After the campaign, half the people that used to drink orange juice now drink cranberry juice. So the pilot and full surveys have very different populations.
One solution is to break the analysis into parts.
1) Analyze the full survey as one part.
2) Reanalyze the data with the pilot survey results included. Add a blocking variable coded as 0 if there is no pilot data for that area, 1 for pilot data, 2 for survey data where pilot data is also present. I would delete block=1, and ask if there was a significant block effect (is block=0 any different from block=2)? I would delete block=0 and ask if there was a significant block effect. If the answer in both cases is no, I would combine the data.
  • asked a question related to Data Management
Question
4 answers
I've been working on composting standards and the composting process. Liquid waste (LW) is nowadays often used as a feedstock. Although there is interest in using it, several of its characteristics increase the risk to the composting process and its impacts (air, water, soil). This is where empirical data and management practices need to be assessed to balance the risks of using LW in the composting industry. Any current research or recent paper related to processing LW in compost is welcome. Thank you.
Relevant answer
Answer
Hi Ifeanyichukwu, this article looks really interesting and I made a request for the full text. Thank you for sharing. Have a nice day. Regards
  • asked a question related to Data Management
Question
7 answers
I'm supplementing an already published genetic database with genes extracted from full mitochondrial genomes from GenBank. I have all accession numbers for the complete genomes, and I have the previous gene sequences too.
I'm using R to do all the downloading and data management, but I'm willing to use other open-access tools for the job. So far, my plan is to download each full genome, replicate it in separate FASTA files for each mitochondrial gene, align all the sequences for that specific gene, and then trim the sequences manually. I could also go genome by genome, extracting the genes manually on GenBank, but I figure there is a smarter and quicker way to do it.
I have never done anything like this, so I'm playing it by ear here. Any tips are welcome.
Relevant answer
Answer
Found it: "AnnotationBustR: An R package to extract subsequences from GenBank annotations" Here: https://peerj.com/preprints/2920.pdf  
  • asked a question related to Data Management
Question
8 answers
Does anyone have useful data management tips for qualitative case studies? I have 20 NGOs, with staff interviews, observation, and document analysis occurring at each NGO. There is going to be a huge amount of data, and I need to be able to cross-reference and corroborate different data sets with one another.
I will be using NVivo to code the documents and interviews, and observations will be linked to organisations and individuals. I also need to keep track of NGO and interviewee demographic details; interview and observation dates/hours/locations; and types of documents (formal/informal, audience, author, purpose, etc.).
So much data! I was thinking I might use NVivo for managing most of it, as I can link observations and demographics to transcripts and documents. I've just been using Excel to keep track of the other data (e.g. number of times an interviewee has been interviewed, duration, location), but I don't know that I've set that spreadsheet up in the best way possible. It would be great if anyone has good ideas! Thanks!
Relevant answer
Answer
Dear Leanne,
I was in your situation 20 years ago with the same number of companies (20) and mixed data (interviews, observations, financial reports and survey results).
There are several major elements to handling the information:
1) First, create a standard SPSS file presenting each interviewee as a separate case (line), adding some specific variables obtained from the interviews and common variables (from observations and documents) for interviewees from the same organisation. In this way you will be able to run different statistical checks on the concordance/disagreement of interviewees' opinions on the same topic, and also to include all data about a particular interview (date, time, etc.).
2) To make such a database useful, you must create specific variables that present your impressions from observations and general results from interviews as numerical variables. I made variables such as "SOC" for my impressions from interviews with CEOs on their preoccupation with employees' needs, and "GOV" for my impressions from interviews with CEOs on their relations with local authorities.
3) Equally important is to design specific metrics for particular actions that reflect your general impression of an NGO resulting from your whole data set. For this you may look at the enclosed file.
Success!
Igor Gurkov
  • asked a question related to Data Management
Question
3 answers
I have baseline and follow-up groups; they are the same people (paired), but in each group I have three categories, so to test the changes I cannot apply the McNemar test, since it works only for 2x2 tables.
Relevant answer
Answer
I presume you mean 3 x 3!
When we extend the McNemar test, there are two hypotheses we can test. The first is asymmetry: that the off-diagonal values are asymmetric. The second is marginal homogeneity: a test that the frequencies along the table margins are similar.
In genetics, the test is called the transmission/disequilibrium test (TDT) and is used to test the association between transmitted and non-transmitted parental marker alleles to an affected child. I don't know if SPSS does the test; Stata and R certainly will.
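For a k x k paired table, the symmetry statistic (the McNemar-Bowker extension) is simple enough to compute directly; a Python sketch with a made-up 3 x 3 table:

```python
def bowker_symmetry(table):
    """Bowker's test of symmetry for a k x k paired table.
    Returns (chi-square statistic, degrees of freedom); compare the
    statistic against a chi-square distribution with k(k-1)/2 df."""
    k = len(table)
    stat, df = 0.0, k * (k - 1) // 2
    for i in range(k):
        for j in range(i + 1, k):
            n_ij, n_ji = table[i][j], table[j][i]
            if n_ij + n_ji > 0:
                # each off-diagonal pair contributes (n_ij - n_ji)^2 / (n_ij + n_ji)
                stat += (n_ij - n_ji) ** 2 / (n_ij + n_ji)
    return stat, df

# Hypothetical baseline (rows) vs follow-up (columns) counts
table = [[20, 5, 2],
         [3, 30, 4],
         [1, 6, 25]]
stat, df = bowker_symmetry(table)  # df = 3 for a 3 x 3 table
```

A perfectly symmetric table yields a statistic of 0 (no change between baseline and follow-up beyond symmetry).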
  • asked a question related to Data Management
Question
2 answers
A field experiment was conducted at Mehoni ARC, Ethiopia, with five varieties, five levels of harvesting age, and three numbers of nodes. All the required data were collected properly. However, during analysis the main-effect mean comparisons were displayed with their respective letters, but the two-way and three-way interactions were not. If there is any other option, please let me know.
Relevant answer
Answer
When using 0/1 (dummy) coding, as in GENMOD, the main effect only shows the difference from the reference group when an interaction is included. With 1/-1 (effect) coding, the main-effect value is not the difference from the reference group; you find the parameter by summing the other levels and multiplying by -1. You can use the CONTRAST and ESTIMATE options to get an overall comparison, as in Firas's example, or a conditional comparison. For categorical factors, the main effect measures a difference; a two-way interaction measures a difference of differences (whether the difference for one factor is the same across levels of the other factor); and a three-way interaction measures a difference of differences of differences.
  • asked a question related to Data Management
Question
7 answers
Hi all.
I have a variable in Stata which has the following structure:
[t, Variable]
1   0
2   0
3   0
4   1
5   0
6   0
7   0
8   2
9   0
where 1 and 2 show the starting and ending periods of a specific event, respectively.
I would like to know how to compute the length of the event in Stata, which would be 5 periods for this particular example.
I know this could easily be achieved in R with a very simple piece of code like the one below:
A = rep(0,20)
A[c(4,15)] = 1
A[c(6,19)] = 2
Begin        = which(A==1)
End           = which(A==2)
Total.Time.of.Event = rep(NA,20)
for(i in 1:length(Begin)){  
    Total.Time.of.Event[Begin[i]:End[i]]  =  End[i] - Begin[i] + 1
}
cbind(A,Total.Time.of.Event)
Thus, the output for the cbind() instruction would be:
          [A ,Total.Time.of.Event]
[1,]    0    NA
[2,]    0    NA
[3,]    0    NA
[4,]    1    3
[5,]    0    3
[6,]    2    3
[7,]    0    NA
[8,]    0    NA
[9,]    0    NA
[10,]  0   NA
[11,]  0   NA
[12,]  0   NA
[13,]  0   NA
[14,]  0   NA
[15,]  1   5
[16,]  0   5
[17,]  0   5
[18,]  0   5
[19,]  2   5
[20,]  0   NA
Any help with this would be greatly appreciated,
Best regards
Juan
Relevant answer
Answer
You can use the egen command in Stata.
Could you please describe the desired syntax in English? That may help me provide some more assistance.
I mean something like:
[Sn,] A Duration
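For what it's worth, the computation in the question's R loop can also be sketched in Python/pandas (a hedged illustration, not Stata syntax), filling the event duration between each start (1) and end (2) marker:

```python
import numpy as np
import pandas as pd

# 1 marks the start of an event, 2 the end (as in the question);
# 0-based positions 3..5 and 14..18 match the R example's 4..6 and 15..19
A = [0, 0, 0, 1, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 2, 0]
s = pd.Series(A)

starts = np.flatnonzero(s.to_numpy() == 1)
ends = np.flatnonzero(s.to_numpy() == 2)

duration = pd.Series(pd.NA, index=s.index, dtype="Int64")
for b, e in zip(starts, ends):
    duration.iloc[b:e + 1] = e - b + 1  # length includes both endpoints
```

As in the R output, the first event gets duration 3 and the second duration 5, with missing values elsewhere.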
  • asked a question related to Data Management
Question
11 answers
Dear colleagues, 
The Likert scale is widely used in the medical and social sciences. It usually gives participants five responses: 1 = Strongly Disagree, 2 = Disagree, 3 = Neutral, 4 = Agree, 5 = Strongly Agree. How should neutral responses be understood, and what should be entered into the SPSS data sheet? In simpler words, to determine the percentage of agreement or disagreement with a certain item, should all neutral responses be omitted?
Thank You
Relevant answer
Answer
  • asked a question related to Data Management
Question
3 answers
I want to know the exact steps for doing recurrent event data analysis in Excel. I have time-to-event data. I want to find the distribution and parameters after checking the data for IID.
Any experts, please help me with that.
Relevant answer
Answer
I have a question on the nonparametric method. If we use the MCF method, how do we estimate future failures, and what will the distribution be? Basically I want to do system modeling using Monte Carlo, so my Excel sheet is based on reliability equations, which differ for each distribution.
  • asked a question related to Data Management
Question
16 answers
There is a common saying in research that data become normal when the sample size is large. What sample size allows one to assume the data are normal? Andy Field refers to sample sizes above 30 as large in his book (if I am right), which seems more applicable to medical science data. But in social science research, what would be a large enough sample size for assuming normality? Is there any literature on this?
Relevant answer
Answer
Dear Rajkumar,
I agree with Rudolf. It depends on the distribution and the statistic you are trying to obtain.
As per my knowledge, for climatic variables a period of 30 years is considered the standard to assess changes. For the Mann-Kendall non-parametric test, which is used to detect trends, if the number of years is greater than 10, the test assumes normality.
Furthermore, for normality tests on small datasets, I have observed in the literature that if n > 50 you should use Kolmogorov-Smirnov, and if n < 50 you should use Shapiro-Wilk. Again, as described by Rudolf, the rate of convergence varies remarkably across distributions.
You can refer the book "The SAGE Encyclopedia of Social Science Research Methods" for various statistical queries.
I hope this helps.
All the best !
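A point worth making explicit: the "n ≥ 30" rule of thumb concerns the sampling distribution of the *mean*, not the raw data, and the central limit theorem behind it can be checked directly by simulation. A quick sketch (the population, n, and repetition count are illustrative only):

```python
import numpy as np

rng = np.random.default_rng(42)
n, reps = 30, 10_000

# Draw repeated samples from a clearly non-normal (exponential) population
samples = rng.exponential(scale=1.0, size=(reps, n))
means = samples.mean(axis=1)

# CLT: the means should be centred on the population mean (1.0)
# with standard error close to sigma / sqrt(n) = 1 / sqrt(30) ~ 0.183,
# and their histogram looks roughly normal even though the data are skewed
```

Repeating this with n = 5 versus n = 100 shows how convergence speed depends on how skewed the parent distribution is, which is the nuance Rudolf raises above.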
  • asked a question related to Data Management
Question
3 answers
Hello! My outcome is the ceod index, an average or median at the population level, but based on individual count data. The ceod index is the sum of teeth with decay, teeth indicated for extraction, and filled teeth per individual; one then calculates the average ceod for the study population. The distribution therefore has excess zeros (about 30% of observations) and over-dispersion.
Relevant answer
Answer
Dear Maria Jose Monsalves 
please check the resources
  • asked a question related to Data Management
Question
9 answers
I am trying to transform a vector of data using the BoxCox command in R. The vector contains a few 0 values, and the command reports "Transformation requires positive data". After further reading I found that the two-parameter Box-Cox transformation is suitable for my case. I have tried the following code:
X4.tr <- boxcoxfit(X4$x, lambda2 = T)
where X4 is the data frame which contains the vector of data "x".
This yields the optimized values of "lambda" and "lambda2"; however, I am not sure what code to use to transform the whole vector of data in x. I tried the following:
X4.trans <- BoxCox(X4$x, 0.71027)
where 0.71027 is the value of lambda estimated using the boxcoxfit function. This results in negative values for the 0 values of the original vector. After replacing the lambda value in the above code with the lambda2 value identified by boxcoxfit, it still results in negative values for 0.
I would really appreciate suggestions for the correct code to perform the two-parameter Box-Cox transformation if I am on the wrong path.
Thanks 
Relevant answer
Answer
###  Check over the following for any errors before use
if(!require(geoR)){install.packages("geoR")}
Turbidity = c(1.0, 1.2, 1.1, 1.1, 2.4, 2.2, 2.6, 4.1, 5.0, 10.0, 4.0, 4.1, 4.2, 4.1, 5.1, 4.5, 5.0, 15.2, 10.0, 20.0, 1.1, 1.1, 1.2, 1.6, 2.2, 3.0, 4.0, 10.5)
hist(Turbidity, col="gray")
library(geoR)
T.box = boxcoxfit(Turbidity, lambda2=TRUE)
lambda = T.box$lambda[1]
lambda2 = T.box$lambda[2]
if(lambda==0){T.norm = log(Turbidity + lambda2)}
if(lambda!=0){T.norm = ((Turbidity + lambda2) ^ lambda - 1) / lambda}
hist(T.norm, col="gray")
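For readers outside R: once lambda and lambda2 are fitted, the transformation step the geoR code above applies is just a shifted Box-Cox; here is a hedged Python equivalent of that final step (the lambda values below are made up, not fitted):

```python
import numpy as np

def boxcox_two_param(x, lam, lam2):
    """Two-parameter Box-Cox: shift by lam2 so zeros become positive,
    then apply the ordinary Box-Cox transform."""
    x = np.asarray(x, dtype=float) + lam2
    if lam == 0:
        return np.log(x)                 # log case when lambda = 0
    return (x ** lam - 1) / lam          # power case otherwise

x = np.array([0.0, 1.0, 2.4, 10.0])          # zeros allowed
y = boxcox_two_param(x, lam=0.5, lam2=1.0)   # illustrative lambdas
```

With lam2 > 0 the zeros map to a finite value ((1 + 0)^lam - 1)/lam = 0 here, which is exactly why the shift parameter fixes the "requires positive data" error.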
  • asked a question related to Data Management
Question
4 answers
If the study has been powered at 80% and the sample size is estimated to be 79 subjects per group, but eventually each group ends up with 140, what would be the impact on the study results?
Relevant answer
Answer
There are several ways to interpret this question. I will assume that you are generally familiar with the effect of sample size on power, accuracy, specificity, type I and type II errors, and so forth. You would have had to go through this when planning the original sample size.
There are many good things that happen with increased sample size. There is one bad aspect to increased sample size. With the larger sample size you are able to detect smaller treatment effects. So you can find a statistically significant difference that is too small to be biologically meaningful. However, there is seldom enough information about the system being studied to be able to balance the treatment effect against all other sources of biological variability. Typically biological experiments in the lab are too controlled, and this can result in unexpected field results following promising laboratory experiments.
Maybe you want to quantify the benefit realized from increasing the sample size. In this activity I would resample the data and run the analysis with several thousand random selections of 79. You now have a distribution of what might have been the outcome if you had stopped at 79 replicates. On this graph, plot the result using all the data to show the effect of having increased the sample size. A less powerful approach would be to compare the results from the first 79 subjects to the results using all the data, since had things worked differently, one would presume that you would simply have stopped at 79 under the original design.
In terms of formulating the null hypotheses, increasing sample size will have no effect. However, sometimes increasing the sample size can enable you to look at factors that you would have had to ignore with the smaller sample size. Are there differences between females and males, ethnic differences, age, weight, and so forth.
Did you mean something else?
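The resampling suggestion above can be sketched as follows. The data here are simulated stand-ins (group sizes, effect size, and the t-test are illustrative assumptions, not the study's actual design): draw many random subsamples of 79 per group from the 140 collected and record the test result each time.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
group_a = rng.normal(0.0, 1.0, 140)     # control (simulated)
group_b = rng.normal(0.4, 1.0, 140)     # treatment, modest effect (simulated)

# Distribution of outcomes had the study stopped at the planned n = 79
p_sub = []
for _ in range(2000):
    a = rng.choice(group_a, 79, replace=False)
    b = rng.choice(group_b, 79, replace=False)
    p_sub.append(stats.ttest_ind(a, b).pvalue)

# Outcome with the full n = 140 per group, for comparison
p_full = stats.ttest_ind(group_a, group_b).pvalue
```

Plotting the distribution of `p_sub` with `p_full` marked on it shows concretely what the larger sample bought.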
  • asked a question related to Data Management
Question
2 answers
I have been working on ASI unit-level data for the period from 1983-84 to 1997-98. I am unable to detect the duplicates in these datasets. Has anyone else worked on ASI data who could help me with this issue?
Relevant answer
Answer
It's done now. I need to know if anyone has used the ASI data for the year 1994-95.
  • asked a question related to Data Management
Question
6 answers
I want to evaluate cloud security and data management.
Relevant answer
Answer
Dear Mohammad
Go for CloudSim. You can simulate most of the cloud-related aspects you need with it, although CloudSim has some limitations compared with the behavior of real cloud environments.
Best Regards
  • asked a question related to Data Management
Question
3 answers
I have two columns of data. The first column is H (solar radiation); the second column is T (temperature). I know the relationship between them: H = a*[1-exp⁡{-b*(T)^c}], where a, b, and c are empirical coefficients, and these coefficients are constants. I want to find the best-fitted empirical coefficients for this data set. I have 3081 {H, T} pairs in Excel. Thanks.
Relevant answer
Answer
Quick and dirty approach:
1. estimate the value of a: it is equal to H at very high T
2. plot log(-log(1-H/a)) vs. log(T)
You should see a straight line. If not, adjust (increase) a slightly until you do. The slope then equals the parameter c, since the straight-line equation reads log(-log(1-H/a)) = log(b) + c*log(T). The value of b can then be found easily, too. But don't ask me what the uncertainties of the a, b, and c so obtained are. Anyway, it should be a pretty good starting point (initial guess) for more advanced computations.
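Those "more advanced computations" can be a nonlinear least-squares fit, using the linearization above only for starting values. A sketch, with simulated data standing in for the real 3081 {H, T} pairs (the true coefficients and noise level below are made up for illustration):

```python
import numpy as np
from scipy.optimize import curve_fit

def model(T, a, b, c):
    # H = a * (1 - exp(-b * T^c))
    return a * (1.0 - np.exp(-b * T**c))

# Simulated data; replace with the real {H, T} columns from Excel
rng = np.random.default_rng(0)
T = np.linspace(1.0, 40.0, 200)
H = model(T, 25.0, 0.05, 1.3) + rng.normal(0.0, 0.3, T.size)

# p0: a ~ max(H) (the plateau at high T), rough guesses for b and c
popt, pcov = curve_fit(model, T, H, p0=[H.max(), 0.1, 1.0])
perr = np.sqrt(np.diag(pcov))   # 1-sigma uncertainties of a, b, c
```

Unlike the graphical method, `curve_fit` also returns a covariance matrix, so the parameter uncertainties come for free.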
  • asked a question related to Data Management
Question
4 answers
The cohorts answered questions related to learning styles. The Internet data-collection group provided only totals for each cohort for questions rated on a scale from strongly agree to strongly disagree.
Relevant answer
Answer
Thanks, Stephen. I will let you know what I discover.
  • asked a question related to Data Management
Question
14 answers
Hello,
I'm working with string variables in SPSS and encountered a problem in managing the data. My variables are codes assigned to each observation, where observations are turns of speaking in a discussion. Sometimes an observation has multiple codes assigned to THE SAME variable (see the picture), i.e., I have more than one value in one cell (in POSFEED - PR-COG, PI). I need to spread out my codes so that each observation contains only 1 value in each cell, i.e., instead of POSFEED I want to have PR-COG and PI as 2 new variables with 0 and 1 in the cells. The problem is this: when I use the RECODE syntax, SPSS does not recode the cells which contain more than 1 value. I understand why, because when using the statement
RECODE POSFEED ('PI' =1)  INTO PI.
EXECUTE.
PI, PI in the same cell does not equal to only one PI.
However, I have a lot of data and want to avoid recoding things manually. I tried using different logical functions and statement, but none of them seem to solve my problem. Can anybody suggest any solution to this?
Thank you!
P. S. My thinking is that an operator for "contains" instead of "equals" ("is", =) could solve the problem, but I can't find it anywhere.
Relevant answer
Answer
Hi,
You can use INDEX to test for substrings; it returns the position of the substring (0 if absent), so comparing against 0 yields a 0/1 indicator.
COMPUTE PI = INDEX(POSFEED, "PI") > 0.
EXECUTE.
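The same spreading of multi-code cells can be done outside SPSS; a sketch with pandas, where `str.get_dummies` turns each code into its own 0/1 column (the POSFEED values below mirror the example in the question):

```python
import pandas as pd

df = pd.DataFrame({"POSFEED": ["PI", "PR-COG, PI", "PR-COG"]})

# One 0/1 column per distinct code, splitting cells on ", "
dummies = df["POSFEED"].str.get_dummies(sep=", ")
df = df.join(dummies)   # adds columns named PI and PR-COG
```

This avoids the substring pitfall entirely, because each cell is split into whole codes before the indicator columns are built.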
  • asked a question related to Data Management
Question
1 answer
BDMS = Big Data Management System
BVT = Best Version of Truth
BDM = Big Data Management
MDM = Master Data Management
Relevant answer
Answer
At some point in time, yes. Please refer to the "CAP theorem" (https://en.wikipedia.org/wiki/CAP_theorem) and the various consistency models; you will especially be interested in "eventual consistency" (https://en.wikipedia.org/wiki/Eventual_consistency).
  • asked a question related to Data Management
Question
2 answers
Dear Researchers
I have unit value index of exports data for country X from 1972 to 2015, with different base years. My question is: how can I change the base year, for instance to the year 2000?
Relevant answer
Answer
Ok thanks a lot Dr Bento
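For the record, the standard rebasing step is to divide the series by its value in the new base year and multiply by 100, so that year becomes 100; if sub-periods use different base years, splice them at an overlapping year first. A sketch with illustrative numbers (not the actual export data):

```python
import pandas as pd

# Hypothetical unit value index values by year
uvi = pd.Series({1998: 92.0, 1999: 96.5, 2000: 101.0, 2001: 104.3})

# Rebase so that the year-2000 value equals 100
rebased = uvi / uvi[2000] * 100.0
```

All growth rates are preserved by this operation; only the level of the series is rescaled.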