Databases - Science topic
A database is an organized collection of data. The data is typically organized to model relevant aspects of reality in a way that supports processes requiring this information.
Questions related to Databases
I have some difficulties in finding a good number of DICOM files for SSM...
It is now commonly advised to submit a dataset to a publicly available repository (e.g. Mendeley) before publishing a paper based on those data. Can I reuse such a repository dataset for a second paper? Can anybody else use my dataset and publish a paper based on it without my consent?
I am looking for the volume of public funding allocated to research topics over time in each country. Is there a reliable database that indexes public funding allocations by research theme or topic (particularly for the US)?
For example, the volume of public funding for "electric battery related research" over the past 30 years in the US.
I'm looking for a food picture database to use in designing a behavioral task. I would like to be able to control for how well known the foods are and for their nutritional value. Thank you in advance.
Suggestions of online databases/tools I can use to verify candidate genes
What are the best databases of blood cell images for research?
We are conducting a systematic literature review and would like to know how to merge the results from different sources into a single database so that duplicates are easy to recognise. Merging Excel files does not seem to be straightforward.
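One way to approach the merging question above is to export each search result as a list of records and match them on a key. The following is a minimal sketch in Python, assuming each record is a dict with hypothetical `title` and `doi` fields (the field names are placeholders, not something any particular database export guarantees). It matches on DOI when one is present and on a normalized title otherwise.

```python
import re

def record_key(record):
    """Return a matching key: the DOI if present, otherwise a normalized title."""
    doi = (record.get("doi") or "").strip().lower()
    if doi:
        return "doi:" + doi
    title = (record.get("title") or "").lower()
    # Drop punctuation and whitespace so small formatting differences do not matter.
    return "title:" + re.sub(r"[^a-z0-9]+", "", title)

def merge_and_deduplicate(*sources):
    """Merge several lists of records, keeping the first copy of each key."""
    seen = {}
    duplicates = []
    for source in sources:
        for record in source:
            key = record_key(record)
            if key in seen:
                duplicates.append(record)
            else:
                seen[key] = record
    return list(seen.values()), duplicates
```

This sketch assumes every record has at least a title or a DOI. A record that carries a DOI in one export and none in another will not be matched, so the duplicates list is a starting point for manual screening rather than a final answer.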
I know of several for the gas phase (e.g. HITRAN, GEISA, PNNL) but not the condensed phase. Most seem to be proprietary databases for matching spectra, but don't allow determining absorption as a function of density or path length.
Hi guys,
I am looking for suggestions/recommendations from the research community regarding public databases that are most commonly used by researchers in their analysis.
Examples are GEO, GTEx, TCGA, gnomAD and TOPMed, but databases from countries other than the US are also of interest.
#genomics #publicdata #genomicdatabases #databases #datamining #TCGA #HCA #GTEX #GEO #ARRAYEXPRESS
How can the information that is currently needed be obtained from Big Data database systems for specific scientific research and for economic, business and other analyses?
Of course, the right data is important for scientific research. However, in the present era of digitization of various categories of information, with libraries, databases and constantly expanding data sets stored in database systems, data warehouses and Big Data database systems, it is important to develop techniques and tools for filtering these large data sets, so that out of terabytes of data only the information currently needed is extracted: for research in a given field of knowledge, for answering a given research question, and for business needs, e.g. after connecting these databases to Business Intelligence analytical platforms. I described these issues in my scientific publications presented below.
Do you agree with my opinion on this matter?
In view of the above, I am asking you the following question:
How can the information that is currently needed be obtained from Big Data database systems for specific scientific research and for economic, business and other analyses?
Please reply
I invite you to the discussion
Thank you very much
Dear Colleagues and Friends from RG
The issues of the use of information contained in Big Data database systems for the purposes of conducting Business Intelligence analyzes are described in the publications:
I invite you to discussion and cooperation.
Best wishes

Below are some issues related to Big Data database technologies that can be developed scientifically:
- Application of data processing technology in Big Data database systems for modern education 4.0,
- Improvement of forecasting of natural, climatic, economic, financial, social etc. phenomena based on analyzing large data sets,
- Analysis of sentiment and opinions of citizens and Internet users regarding brand recognition of companies, customer reviews of specific services and products, views on various topics, and citizens' worldview, based on the analysis of large collections of information downloaded from various websites and from comments on social media portals,
- Analysis of the information and marketing services of commercially operating companies that carry out specific analyses of sentiment and of the opinions of citizens and Internet users regarding brand recognition, customer reviews of specific services and products etc. on behalf of other companies that purchase specific analytical reports,
- Analysis of the possibilities of cooperation, synergy, correlation, conducting interdisciplinary research, connecting Big Data database systems with other information technologies typical for the development of the current fourth technological revolution called Industry 4.0, which include technologies such as: cloud computing, machine learning, Internet of Things, Artificial Intelligence, etc.
In what other areas are the technologies of processing and analysis of information in Big Data database systems used?
Please answer
Best wishes
Hi
I am working on diagnosing the location of aphasia lesions in stroke patients and I need a database of MRI images of aphasia patients.
I have the "X'Pert HighScore Plus (3.0.5)" software to plot my samples' XRD patterns. When I search for a reference pattern, the software cannot find an exact match. So, I need to add reference databases to the software. How can I do it?
What is the proper method for this particular version?
What kind of scientific research dominates in the field of Big Data database systems?
Please provide your suggestions for a question, problem or research thesis on the topic of Big Data database systems.
Please reply. I invite you to the discussion
Dear Colleagues and Friends from RG
Some of the currently developing aspects and determinants of the applications of data processing technologies in Big Data database systems are described in the following publications:
I invite you to discussion and cooperation.
Best wishes

What kind of scientific research dominates in the field of popularization of science on the Internet?
Please provide your suggestions for a question, problem or research thesis on the topic of popularization of science on the Internet.
Please reply.
I invite you to the discussion
Thank you very much
Best wishes

In some countries, work in the scientific field pays only modestly. Young people have little interest in the complex problems of science. The popularization of science is necessary. The pages of RG publish scientific questions and answers to them.
Can the publication of answers to questions in the RG be considered a popularization of science?
Do you know any databases that not only specify the plant origins of a specific phytochemical, but also show how much of that substance can be extracted from specified parts of the plant?
I have also found this awesome website but it doesn't work at all! Besides answering my question, could you please let me know whether you get any results when searching for a term in it?
Hello,
I am teaching a database systems class and I wish to show students how distributed databases work.
We are using the PostgreSQL DBMS for illustration.
What other applications do I need to set up a DDBMS on a single Windows machine?
Best
Derdus
Hello everyone,
I have been having problems with gathering all the information that I need for my study and also, some problems with fixed effects.
1) First of all, I am interested in comparing the exports of Panama to certain partner countries (in Central America and some others). I am using panel data in Stata (1994-2017) and the variables of interest are:
1) Exports (y)
2) Distance between the capitals
3) GDP (c_origin)
4) GDP (destination_c)
5) Population (c_origin)
6) Population (destination_c)
plus some dummy variables:
7) common language (0,1)
8) borders between countries (0,1)
9) whether the destination country belongs to Central America (main variable of interest; 0, 1)
Is there any page where I could get most of the data? I used CEPII and could find data until 2015 for distance, language, population and GDP (but only in current dollars). ***I would need data for 2016 and 2017, as well as the GDP deflator and Panama's exports to those specific countries for the period 1994-2017. How do I convert GDP measured in current dollars using the GDP deflator, and are there any pages where those values are already available?*** I first used GDP measured in current dollars and, with the data I had, the results did not turn out well.
2) Another problem I encountered when running a hierarchical regression with my first "faulty" database was that, when I included country fixed effects, all of the variance was already explained by the countries, so there was no variance left for my variables of interest to explain. On the other hand, if I do not include them, the model will be biased. What is more, without any fixed effects, in some of the regression blocks I get counterintuitive correlations/b coefficient values. Therefore, my questions are (apart from the sentences marked with *** above):
a) How can I solve this problem so that my model is not biased and my variables still explain significant variance in the y (exports) variable?
b) Which fixed effects should I add to the model? Country, distance, population... which ones?
This is the first time I am using econometrics and fixed effects so your help would mean a lot!
Thank you in advance!
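On the GDP deflator part of the question above: the usual conversion is real GDP = nominal GDP x (deflator in the base year / deflator in the given year). A minimal sketch in Python follows, assuming the nominal series and the deflator are both available per year as plain dicts. The function name and the numbers in the usage note are illustrative placeholders, not real data.

```python
def real_gdp(nominal_gdp, deflator, base_year):
    """Convert nominal (current-price) GDP into constant base-year prices.

    nominal_gdp: dict mapping year -> GDP in current dollars
    deflator:    dict mapping year -> GDP deflator (any index base)
    base_year:   year whose price level the result is expressed in
    """
    base = deflator[base_year]
    # Dividing by the year's deflator removes that year's price level;
    # multiplying by the base-year deflator re-expresses it in base-year prices.
    return {year: value * base / deflator[year]
            for year, value in nominal_gdp.items()}
```

For example, with made-up values, `real_gdp({2000: 100.0, 2001: 110.0}, {2000: 100.0, 2001: 105.0}, 2000)` leaves the base year unchanged and scales 2001 down to about 104.76. The deflator must be the one for the same country and currency as the nominal series.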
Libraries left right and centre are cancelling print versions of academic journals and discarding their old journal stocks. When challenged, they say don't worry, this information will all be freely available on the Internet. This is not even completely true nowadays, but what is the long-term future? We are entering a Digital Dark Age, not helped by the fact we have our eyes closed as well. Dangers and factors militating against indefinite free storage include -- energy supply security; energy costs (some data centres use as much electricity as a small town); obsolete storage devices and digital formats; missing software; ephemeral recording media; planned obsolescence; unreliable or complacent custodians; malicious hackers, criminals or terrorists; politically or religiously motivated activists. Some of these points are discussed by Roger Highfield in Daily Telegraph Jan 7 2014 p25.
hi all,
How do I tell R that the row names are in a certain column when importing a file into R with the read.csv function?
The Microsoft Jet database engine could not find the object
or
Column 'DATE' does not belong to table tbITs
The first error occurs when using a .dbf and the second error occurs when using a text file.
Any help is greatly appreciated.
I need to do an analysis with STRUCTURE using dominant data. If you have an example of this type of database, please contact me.
Most of the publicly available databases give only basic information such as age, gender, mode of infection, etc. for patients with COVID-19. Can anyone recommend more specific databases of image, speech or clinical data from these patients that are open for research?
I have been trying to find an online source for the FG-NET ageing database, the MORPH database and the YGA database. I can't seem to find any of them available to download. Does anyone know any other ageing database that I can use? Thank you. I need an ageing database for my school thesis, as I am building a face recognition system that classifies faces by age. It would be very helpful.
The medical conditions include heart and respiratory rates, systolic and diastolic pressure, etc. It would be more helpful if the dataset also included information about the patients' use of medications such as NSAIDs and DMARDs prior to COVID-19 infection.
I need a database of tomato leaves for testing my algorithms.
I need a dermoscopic image database to test an algorithm for automatic diagnosis. In particular I am interested in images with blue-black colors within the lesion.
Can anyone tell me where to find it?
Biomedical
Medical hyperspectral imaging
I need a research paper that comes with an available dataset so that I can start my research.
How, in your opinion, will applications of the technology for analyzing large collections of information in Big Data database systems be developed in the future?
In which areas of industry, science, research, information services, etc., in your opinion, will applications of technology for the analysis of large collections of information in Big Data database systems be developed in the future?
Please reply
I invite you to the discussion
I described these issues in my publications below:
I invite you to discussion and cooperation.
Best wishes

I am currently planning to use a survey method in one of my research projects on business units. The Orbis database (of Company information across the globe | BvD) or a similar one would be useful for drawing a sample according to certain criteria and obtaining contacts. My organization does not provide access to the Orbis database. Maybe someone has access to this database and could provide me with data from it or recommend free alternatives?
Thank you in advance.
I want to perform database operations in a distributed database environment. If anybody has ideas about this, please share.
Thanks in advance.
To give you some context, our work consists of building a machine learning model that takes a vector of a farm's properties, possibly including the weather, and then, from a database of crops, recommends the most suitable crop for the soil. Identifying the elements that help in this decision is therefore an important step before starting to collect the data needed for the model.
Hello,
I wonder what would be the best database to store and manage a large amount of data (maybe a few hundred gigabytes) at the main basin level, including:
1) GIS data:
- raster
- vector
2) Hydrological data:
- time-series of different variables (e.g., rainfall, temperature, humidity, etc.) for different stations.
From the internet, I found that the below databases could be used:
- PostGIS
- MongoDB
Thank you very much
Dear all,
I am looking for free online sources of gridded weather/climate data with high spatial (1 x 1 km) and temporal (hourly or three-hourly) resolution to be used in my research. The spatial domain is Europe (or even the world if possible). Could you please suggest the best available data sources?
Thank you for the support.
Best,
Giorgio
Dear researchers, is it possible to merge two or more RIS files from the Scopus database into one RIS file?
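Since a RIS file is plain text in which each record starts with a `TY  -` line and ends with an `ER  -` line, merging amounts to concatenating the records. A minimal sketch in Python follows, assuming UTF-8 exports. The function names are hypothetical, and no deduplication is attempted.

```python
def split_records(text):
    """Split RIS text into a list of records (each from 'TY  -' to 'ER  -')."""
    records, current = [], []
    for line in text.lstrip("\ufeff").splitlines():
        if line.startswith("TY  -"):
            current = [line]                 # a new record begins
        elif line.startswith("ER  -"):
            if current:
                current.append("ER  - ")     # close the record
                records.append("\n".join(current))
            current = []
        elif current and line.strip():
            current.append(line)             # tag line or continuation line
    return records

def merge_ris_texts(texts):
    """Concatenate the records of several RIS texts into one RIS text."""
    records = []
    for text in texts:
        records.extend(split_records(text))
    return "\n\n".join(records) + "\n"

def merge_ris_files(paths, out_path):
    """Read several RIS files, write one merged file, return the record count."""
    texts = []
    for path in paths:
        # utf-8-sig drops a byte-order mark if the export starts with one
        with open(path, encoding="utf-8-sig") as handle:
            texts.append(handle.read())
    merged = merge_ris_texts(texts)
    with open(out_path, "w", encoding="utf-8") as handle:
        handle.write(merged)
    return merged.count("\nER  - ")
```

A call such as `merge_ris_files(["export1.ris", "export2.ris"], "merged.ris")` (file names are placeholders) would write all records into one file. Most reference managers can also import several RIS files into one library and export them together, which may be simpler.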
Hi there,
Can anyone recommend a Delphi method online tool that I can use?
Mesydel.com has been recommended but you are not able to download it.
I have been recommended http://armstrong.wharton.upenn.edu/delphi2/ as well although the website is less user friendly than I would like.
Any other suggestions?
Thanks.
L
As part of various financial research initiatives, we need different types of datasets. This question is asked to identify online sources of financial data, both free and paid. This information can help every researcher to locate an online directory or data bank for financial and economic analysis.
Please suggest a standard image database for age estimation and prediction.
I have gone through many questions, but there is no single discussion thread giving the link, API or JSON for the dataset.
I am making a start so that it will be easily searchable by everyone.
<COUNTRY NAME> :: <TYPE - JSON/CSV/... etc...> :: <KIND OF DATA - COUNT/NETWORK INFO>
<URL/s>
The COVID-19 pandemic has been an unprecedented situation, with rapid research and development taking place. A lot of data is being generated, such as the total number of patients infected, active cases, recovered and deceased.
Data can be obtained from different sources. For example, in India, https://www.mohfw.gov.in/ is the official website of the Govt. of India, https://www.covid19india.org/ is a website run by a group of dedicated volunteers, and https://www.worldometers.info/coronavirus/ is Worldometer, maintained by Dadax.
Is there any data validation model or method to verify the data that has been put out?
There is a possibility of overstating or understating the numbers. There can be a discrepancy between sources.
Has there been a solution or discussion about this in the research community? If so, what is it?
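As a starting point for the discrepancy question above, the figures from several sources can at least be compared automatically. The sketch below, in Python, flags any metric whose reported values differ across sources by more than a chosen relative tolerance. The function name, the data layout and the 5% default are assumptions for illustration. It only detects disagreement and cannot say which source is correct.

```python
def flag_discrepancies(reports, tolerance=0.05):
    """Return metrics whose values differ across sources by more than `tolerance`.

    reports: dict mapping source name -> dict of metric -> reported count.
    The spread is (max - min) / max over the sources reporting that metric.
    """
    metrics = set()
    for values in reports.values():
        metrics.update(values)
    flagged = {}
    for metric in sorted(metrics):
        observed = {source: values[metric]
                    for source, values in reports.items() if metric in values}
        if len(observed) < 2:
            continue  # only one source reports it, nothing to compare
        low, high = min(observed.values()), max(observed.values())
        spread = (high - low) / high if high else 0.0
        if spread > tolerance:
            flagged[metric] = {"spread": spread, "values": observed}
    return flagged
```

Sources often differ simply because they update at different times of day, so comparing figures for the same reporting date matters more than the exact tolerance.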
The CTU-UHB Intrapartum CTG database consists of CTG records and clinical information. The data in the database have been extracted from the OB TraceVue system to an open format using software. I need the raw data of this database. The database contains .dat and .hea files, but I can't find a way to extract the raw data. Can anyone help me?
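The .hea/.dat pair is the PhysioNet WFDB layout: the .hea file is a text header and the .dat file holds the binary samples. The usual route is the WFDB tools or the `wfdb` Python package. As a rough illustration of what those files contain, here is a standard-library-only sketch in Python. It assumes the signals are stored in WFDB format 16 (interleaved little-endian 16-bit integers) in a single .dat file with no byte offset or skew, which should be checked against the actual header before trusting the output.

```python
import re
import struct

GAIN_RE = re.compile(r"^([-+.\deE]+)(?:\((-?\d+)\))?(?:/(\S+))?$")

def parse_header(text):
    """Parse the record line and signal lines of a WFDB .hea file."""
    lines = [ln for ln in text.splitlines()
             if ln.strip() and not ln.startswith("#")]
    record = lines[0].split()
    n_sig = int(record[1])
    fs = float(record[2].split("/")[0]) if len(record) > 2 else None
    signals = []
    for index, line in enumerate(lines[1:1 + n_sig]):
        tok = line.split()
        if tok[1] != "16":
            raise ValueError("this sketch only handles WFDB format 16")
        match = GAIN_RE.match(tok[2]) if len(tok) > 2 else None
        # A gain of 0 (or a missing gain) means the WFDB default of 200.
        gain = (float(match.group(1)) if match else 0.0) or 200.0
        adc_zero = int(tok[4]) if len(tok) > 4 else 0
        # Without an explicit baseline, the ADC zero is used as baseline.
        baseline = int(match.group(2)) if match and match.group(2) else adc_zero
        signals.append({
            "name": " ".join(tok[8:]) or "signal_%d" % index,
            "gain": gain,
            "baseline": baseline,
            "units": match.group(3) if match else None,
        })
    return {"fs": fs, "signals": signals}

def decode_format16(data, signals):
    """Turn interleaved 16-bit samples into physical values per signal."""
    count = len(data) // 2
    raw = struct.unpack("<%dh" % count, data[:2 * count])
    n_sig = len(signals)
    return {sig["name"]: [(v - sig["baseline"]) / sig["gain"]
                          for v in raw[i::n_sig]]
            for i, sig in enumerate(signals)}
```

With a header read as text and the .dat file read in binary mode, `decode_format16(dat_bytes, parse_header(hea_text)["signals"])` returns one list of physical values per signal, which can then be written to CSV.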
I want to do a sample project with SVM, but I cannot retrieve data from the ZINC database.
I need sample data from this database for classification. My classification method is SVM.
Thanks
I want to study the relationship between different epigenetic factors and the different types of cancer using existing records in epigenetic and/or oncological databases, but, as a bioinformatician, I have never worked with epigenetics data, so I do not know in what format they are available, what type of preprocessing they require, or what tools I can use to analyze them.
I would really appreciate if someone gave me some basic indications of how I should start, or if someone recommended me a paper or tutorial about how to work with epigenetic data in cancer bioinformatics.
Hello,
I am looking for databases of quadruplex structures (both C-based i-Motifs and G-based G-quadruplexes (G4s)).
I have only found the databases:
- G4RNA, http://scottgroup.med.usherbrooke.ca/G4RNA/ (RNA G4s)
- G4Hunter supplementary material database (mostly all DNA G4s)
Does anybody know whether any other sources of quadruplex structures exist?
Thanks in advance.
Hello everyone
I am currently a master's student at Nevşehir Hacı Bektaş Veli University, Department of Geography, studying Physical Geography. My main areas of expertise are geographical analysis, plant geography, data mining, MapReduce and Hadoop systems, land planning and plant taxonomy; in addition, I have been working on social media analytics, social media applications and analysis, and social media and geography education in scientific and technical terms. But my main focus is on "Data Mining and Plant Geography modeling". I have a technical research article on this topic available as full text in my ResearchGate account. The study is titled "Creatıon of Plant Geography Databases Wıth The Map Reduce Modelıng ın The Clusterıng of the Large Geographıcal Data Sets". In this study, in addition to data mining, GIS, and plant geography research methods and techniques, I used MapReduce and Hadoop systems and algorithms, drawing on various international plant databases, biological databases in Turkey, and vegetation and plant geography studies carried out around the world. With this latest work I tried to develop a new database model that will contribute to these fields. In conclusion, I would especially like to hear and benefit from the ideas and opinions of my colleagues and teachers working on data mining and geography, plant geography or vegetation, especially among geographers. Thanks in advance to everyone who contributes. Sincerely.
In my investigation of the triangular relationship between international sales, international ownership and the riskiness of a stock I am looking for a database that could provide the % of foreign ownership of shares from DAX and MDAX companies between 2003-2019.
I have a list of reference SNPs IDs (rsid) and I need to retrieve the associated diseases ... what are the suitable bioinformatics tools or databases?
I'd like to publish some ideas about taxonomical database. This is possible to do in Biodiversity Data Journal or PeerJ, but I need to pay for an article. Does anybody know similar journals without article processing charge?
For my research, I am trying to find ecological, geographic, hydrologic, social, economic and political spatial/GIS data that is preferably free and easily available. I am especially interested in layers associated with protected areas, distribution of Adivasi populations, Adivasi owned / managed lands, watersheds, dams/roads/mines/embankments, land ownership, or any other data within these fields. I would greatly appreciate some inputs/recommendations/tips as the government data.gov.in website has been difficult to navigate and almost impossible to find data on and the bhuvan website also doesn't allow data downloads.
Hello, can anyone help me?
When I open a CIF file with File > Open, the software (CrystalExplorer 17.5) shows an error while processing the CIF:
Error in TEXTFILE:open_new_file_for_write ... error opening new file.
I checked the CIF file with CCDC checkCIF and it is reported as correct. I also opened a CIF file generated by Mercury from my original CIF file and got the same error. I tried CIF files from the CCDC database as well, but without success. Could anyone give me a hand, please?
I am looking for the minimal inhibitory concentrations (MICs) of different antimicrobial agents (such as chloramphenicol, ketoconazole, nitrofurantoin, etc.). Is there perhaps a database where MICs against different bacteria and fungi (S. aureus, E. coli, P. aeruginosa, K. pneumoniae, C. albicans, etc.) can be found?
It could also be a series of images, but not a single image, for emotion.
I am currently trying to search for data on RLFS (R-loop forming sequences) for the human genome on UCSC Table Browser but I cannot find anything. Does anyone know if these data exist?
Currently I am trying to generate it myself using QmRLFS-finder, but it would be great if I could find other sources.
Thank you in advance,
Ana
I have a large amount of environmental data as independent variables, so I used a PCA to make them easier to work with. But I have problems extracting/converting the data from the principal components so that they can be used as variables in different GLMs. I'm working in R. Can anyone help me?
Thank you,
Ferran
We are interested in developing a method for predicting siRNA, so we need a large set of siRNAs for developing models. I would highly appreciate it if you could suggest the best database or databases on siRNA. This will help us create a large dataset that may cover all experimentally characterized siRNAs. Please also suggest the best (latest) prediction method for siRNA. Do you think there is a possibility of developing a better prediction method, or is this field already saturated?
Can you help me with some databases for neuroscience? For example, an fMRI database, a database that shows the underlying mechanisms of the brain or the connection between brain and behavior, a psychiatry database, or other resources related to the brain. In genetics we have, for example, Reactome, KEGG, STRING and other databases that show many pathways and cell connections. I wonder if there is something like that in neuroscience: a big database that helps us to better understand the brain.
Hi everybody!
About 6 months ago I started a data collection through online questionnaires sent by email. I got almost 2000 people at baseline, 60% of whom agreed to be contacted again for the follow-up (an online questionnaire of about 10 minutes) in order to investigate associations over time.
In your view, considering the size of the sample and the online recruitment, what response rate should I realistically aim for in order to have a "strong" dataset for my analysis?
Below which response rate should I give up the idea of using the longitudinal information?
Is there any?
I would love to hear your idea and your experience with this matter.
Thanks in advance for sharing!
I have XRD spectra of biomass samples and need to find out the possible mineral phases present. How do I proceed?
We are looking to implement a web-based lab notebook as well as a tracking system to upload various assay results for several analogs of a parent compound. We will need to keep very close track of lot numbers, dates received, chemists who synthesized them, etc. Does anyone use a service that would be helpful?
Hello, I'm trying to find specific gene expression in various types of cancers and cancer cell lines. Is there any database for this?
Hello all,
I'm trying to test out a predictive model but I'm having a very hard time trying to find hourly precipitation data. I've looked at NOAA and on the new data repository (NCDC) but I can't find hourly or 15 minute interval data past 2014. Am I missing something here? Is there an alternative source I don't know about? If it helps typically this is the weather station I have pulled from in the past: USW00014819.
Any help is appreciated. Thanks.
Can anyone recommend a database that contains raw multispectral images with the different bands and, in the same database, the NDVI and NDWI indices, so that I can compare the results obtained? Also, if I use my own multispectral images, how can I compare the vegetation and water of real plants with the NDVI and NDWI indices?
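For computing the indices from one's own bands, both are normalized differences: NDVI = (NIR - Red) / (NIR + Red), and the McFeeters form of NDWI = (Green - NIR) / (Green + NIR). Another NDWI definition (Gao) uses NIR and SWIR instead, so it is worth checking which one a reference database uses. The sketch below, in Python, assumes each band is a flat list of reflectance values of equal length, which is a simplification of real raster data.

```python
def normalized_difference(band_a, band_b):
    """Per-pixel (a - b) / (a + b); None where the denominator is zero."""
    result = []
    for a, b in zip(band_a, band_b):
        total = a + b
        result.append((a - b) / total if total != 0 else None)
    return result

def ndvi(nir, red):
    """Normalized Difference Vegetation Index from NIR and red bands."""
    return normalized_difference(nir, red)

def ndwi(green, nir):
    """Normalized Difference Water Index (McFeeters form: green and NIR)."""
    return normalized_difference(green, nir)
```

Both indices fall between -1 and 1, so maps computed this way can be compared pixel by pixel with reference index layers, provided the bands are co-registered and in the same units.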
I am looking for databases that contain microRNA-drug interactions. Any suggestions or recommendations?
I've spent a lot of time but still could not find good-quality public/free data for causal inference (binary treatment; e.g., A=0 or 1, non-DCD vs DCD donations) with survival outcomes. I know it's quite a specific requirement, but I need one for my master's thesis. Of course, one might recommend the datasets from the 'survival' package in R.
But I really want to find good, real (if possible, not too old) data.
For example, the one from this journal looks perfect (but I cannot access it).
Can I get some advice or recommendation for such data?
Any comment is appreciated!
I am processing a 16S rRNA next-generation sequencing data set and trying to compare, between my samples, the effects on organisms involved in the nitrogen cycle. I am wondering if there is some sort of database, or even a good paper, that covers all of the known organisms in the nitrogen cycle. The more detail the better, but I would settle for just a simple list.
I am carrying out my postgraduate thesis project on the extractive industry firms and their reporting practices.
I'm interested in automated algae identification using neural networks. I need to compile a substantial photomicrograph dataset of algal genera (most importantly of Cyanobacteria, Chlorophyta and Bacillariophyta).
Thank you in advance!