Metadata - Science topic
Questions related to Metadata
I'm working with a marine ecosystem model and we will have a number of outputs that we want to make publicly available. Ocean and climate modellers often use the CF Metadata Convention (https://cfconventions.org/) to describe their model outputs, providing much needed context for potential users.
I've been looking for an ecological equivalent to the CF convention that I could apply to marine ecosystem model outputs and I came across the Ecological Metadata Language (EML, https://eml.ecoinformatics.org/). Their website states: "The EML project is an open source, community oriented project dedicated to providing a high-quality metadata specification for describing data relevant to diverse disciplines that involve observational research like ecology, earth, and environmental science."
Since it mentions "observational research", I'm unsure if EML could be applied to ecosystem model outputs. Is anyone familiar with EML and could shed some light on whether it would be appropriate to use for model outputs?
Dear ResearchGate Support Team,
I hope this message finds you well. I have recently noticed that all the information related to one of my publications has disappeared from my profile, including the title, authorship, and associated metadata.
Could you please help me understand why this might have happened? I would also appreciate your guidance on how to restore the missing data or whether there is a way for me to recover the publication details on my own.
Thank you very much for your assistance. I look forward to your support.
Best regards
Tran Thi Thu Tra
In April 2025, scientists reported a significant update: baryonic matter long thought to be “missing” in the observable universe has been located — in the form of ionized hydrogen gas stretched across the cosmic web.
While this solves a long-standing observational puzzle, it also opens deeper questions:
What if the reason we couldn’t detect this matter wasn’t just technological — but conceptual?
In our ongoing independent work under VoidPrakash, we explore the idea that time is not a universal axis but a frame- and metadata-dependent function.
In this view, the "moment of observation" is constructed differently depending on motion, context, and information state — making some aspects of the universe effectively temporally occluded until certain parameters align.
Questions for fellow researchers:
- Could a non-linear, layered model of time better explain delays in detectability of certain phenomena?
- Do you see observational lag in cosmology as technological, or partly epistemic?
- What frameworks do you think are needed to explore the link between causality, visibility, and systemic metadata?
📄 Our comparative draft: Trapped in Time: The Cost of a Linear Assumption
📎https://osf.io/preprints/osf/d96u7_v1
Would love to hear your perspectives — especially from those working in astrophysics, philosophy of science, or time perception.

From September 1, 2025, all AI-generated content published in China will have to be clearly labeled.
Four administrative bodies in China - the Cyberspace Administration of China, the Ministry of Industry and Information Technology, the Ministry of Public Security, and the State Administration of Radio and Television - have announced a regulation on mandatory labeling of AI-generated content. The regulation has 14 articles and specifies who and what is subject to its requirements.
👨‍⚖️ The regulation applies to text, images, audio, video, and virtual scenography
👨‍⚖️ AI content identification is to be both direct (user-visible) and indirect (embedded in metadata)
👨‍⚖️ For text - a textual or symbol label at the beginning, middle, or end of the text, or a visible label in the interface of the interactive scene or around the text
👨‍⚖️ For audio - a voice warning, placed as above
👨‍⚖️ For images - a visible label indicating AI generation
👨‍⚖️ For video - a visible label at the beginning, middle, and end of the video
👨‍⚖️ In all cases, the content metadata must state that it was generated by AI
👨‍⚖️ If the content can be copied, downloaded, or exported, the files must carry the corresponding information indicating that it was generated by AI
👨‍⚖️ Internet service providers are responsible for ensuring that the content they publish is correctly marked as AI-generated. They do this based on: a) metadata entries; b) a declaration by the content provider if the metadata carries no clear description; c) a statement by the provider that the content is AI-generated even if neither the metadata nor the provider's declaration mentions it
👨‍⚖️ Information on the methods of marking AI-generated content must be clearly described and available to every user of the service providers
👨‍⚖️ It is prohibited to remove, modify, or hide AI-generated content markings
Content of the Notice on the Issuance of Measures for Identifying Synthetic Content Generated by Artificial Intelligence (Guoxinbantongzi [2025] No. 2) (in Chinese)
All the data conversions (SNP file and metadata file) have been done correctly. The impute command is also executed correctly. Both the "frequency" and "neighbour" commands were tried, but all the missing data simply gets replaced by "NA" and no imputed value is allocated. Can someone tell me what is going wrong?
Note: My species is an orphan tree species, which does not have any population sequencing data to use as a reference panel, although the genome has been sequenced for a standalone tree. I want to use de novo imputation with the existing data.
I have fluorescent microscopy images in LIF format. There were 3 fluorescent dyes that I imaged with 3 channels. Two of the three signals are detected in very similar areas of my sample, so they are difficult to tell apart. Basically, I do not remember which image is in the second channel and which is in the third. I tried looking at the OME metadata but couldn't find anything useful. I would like to know which channel captured which wavelengths. Is there a simple way to get this information?
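If you can export the OME-XML for the file (e.g. with Bio-Formats' `showinf -nopix -omexml image.lif`), the per-channel wavelengths can be read from the `Channel` elements. A stdlib sketch; the sample XML below is illustrative, not from a real Leica file:

```python
# Sketch: list (name, excitation, emission) for every Channel in OME-XML.
import xml.etree.ElementTree as ET

def list_channels(ome_xml: str):
    """Return (Name, ExcitationWavelength, EmissionWavelength) per channel."""
    root = ET.fromstring(ome_xml)
    channels = []
    for el in root.iter():
        # OME-XML uses a versioned namespace; match on the local tag name
        # so this works regardless of schema version.
        if el.tag.rsplit('}', 1)[-1] == "Channel":
            channels.append((el.get("Name"),
                             el.get("ExcitationWavelength"),
                             el.get("EmissionWavelength")))
    return channels

sample = """<OME xmlns="http://www.openmicroscopy.org/Schemas/OME/2016-06">
  <Image ID="Image:0"><Pixels ID="Pixels:0">
    <Channel ID="Channel:0" Name="DAPI" ExcitationWavelength="405" EmissionWavelength="461"/>
    <Channel ID="Channel:1" Name="GFP" ExcitationWavelength="488" EmissionWavelength="507"/>
  </Pixels></Image>
</OME>"""
print(list_channels(sample))  # -> [('DAPI', '405', '461'), ('GFP', '488', '507')]
```

Note that Leica files do not always populate these attributes; if they are empty, the acquisition settings inside the LIF container (readable with Bio-Formats) are the fallback.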
I want to do microbiome analysis in R, for which I have three folders containing ASV data, metadata (i.e., sample data), and taxonomy data for 56 sample studies. I want to read the data in those folders, create a separate phyloseq object for each sample study, and then make a merged phyloseq object. The issue I am facing is that, while creating a phyloseq object, the ASV table is reported as non-numeric. What should I do in such a case?
I have 10 pre-processed studies for which I have prepared ASV tables, taxa tables, metadata, and phylogenetic trees. Now I want to merge these studies and create a single merged phyloseq object for further downstream processing.
The ASV tables, taxa tables, and metadata are CSV files, while the trees are in text format.
# Load required libraries
library(phyloseq)
library("ape")
# Function to load metadata files from a folder
load_metadata_files <- function(folder_path) {
metadata_files <- list.files(path = folder_path, pattern = "\\.csv", full.names = TRUE)
metadata_list <- lapply(metadata_files, read.csv, header = TRUE, row.names = NULL)
return(metadata_list)
}
# Function to load ASV files from a folder
load_asv_files <- function(folder_path) {
asv_files <- list.files(path = folder_path, pattern = "\\.csv", full.names = TRUE)
asv_list <- lapply(asv_files, read.csv, header = TRUE, row.names = 1)
return(asv_list)
}
# Function to load taxonomy files from a folder
load_taxonomy_files <- function(folder_path) {
taxonomy_files <- list.files(path = folder_path, pattern = "\\.csv", full.names = TRUE)
taxonomy_list <- lapply(taxonomy_files, read.csv, header = TRUE, row.names = 1)
return(taxonomy_list)
}
# Function to load phylogenetic tree files from a folder
load_tree_files <- function(folder_path) {
tree_files <- list.files(path = folder_path, pattern = "\\.txt", full.names = TRUE)
trees <- lapply(tree_files, read.tree)
return(trees)
}
# Specify folder paths
metadata_folder <- "C:/Users/Saesha Verma/OneDrive/Desktop/Metadata_SB"
asv_folder <- "C:/Users/Saesha Verma/OneDrive/Desktop/ASV_SB"
taxonomy_folder <- "C:/Users/Saesha Verma/OneDrive/Desktop/Taxa_SB"
tree_folder <- "C:/Users/Saesha Verma/OneDrive/Desktop/Tree_SB"
# Load metadata, ASV, and taxonomy files
metadata_list <- load_metadata_files(metadata_folder)
asv_list <- load_asv_files(asv_folder)
taxonomy_list <- load_taxonomy_files(taxonomy_folder)
tree_list <- load_tree_files(tree_folder)
create_phyloseq <- function(asv_list, taxonomy_list, metadata_list, tree_list) {
  # Note: rbind() across studies fails here ("numbers of columns of
  # arguments do not match") because each study has its own set of
  # samples/columns. Build one phyloseq object per study instead, then
  # let merge_phyloseq() align taxa and samples across studies.
  ps_list <- mapply(function(asv, tax, meta, tree) {
    phyloseq(otu_table(as.matrix(asv), taxa_are_rows = TRUE),
             tax_table(as.matrix(tax)),   # tax_table() expects a matrix
             sample_data(meta),
             phy_tree(tree))
  }, asv_list, taxonomy_list, metadata_list, tree_list, SIMPLIFY = FALSE)
  # Merge all per-study objects into a single phyloseq object
  ps <- do.call(merge_phyloseq, ps_list)
  return(ps)
}
ps <- create_phyloseq(asv_list, taxonomy_list, metadata_list, tree_list)
I am using this code but I encounter this error:
ps <- create_phyloseq(asv_list, taxonomy_list, metadata_list, tree_list)
Error in rbind(deparse.level, ...) :
numbers of columns of arguments do not match
I am a co-author of the article:
DOI: 10.1051/0004-6361/202142242
This article is presented on the RG by the research item:
in which not all authors are listed (only 21 of 36).
Because of this error in the metadata, 15 researchers (including me) cannot verify their authorship. How can I add the missing authors to this research item?
I am having difficulty opening Landsat images in ENVI 5.1 via File > Open As > Landsat > GeoTIFF with Metadata. It keeps flagging the following error message: "Unable to read file as the specified filetype: LANDSAT_GEOTIFF_META"
Has anyone encountered this issue and how was it resolved?
My professor asked me to find a software solution for scRNA-seq data analysis (preprocessing and analyzing data). The department is getting Vizgen and CosMx machines next quarter, and having 3-4 labs depend on one bioinformatician is really inconvenient.
Some companies I am looking at:
- BioTuring
- Rosalind
- Cellxgene
Most of the time, we want to compare metadata categories to generate insights, and also apply gene set / pathway analysis to them. Any suggestions?
We want to look up relevant scholarly articles from many nations in Africa; the study aims to cover the entire continent. It is believed that a wide range of investigations on the specific topic have been carried out across the continent's nations so far.
In a search engine (like Google Scholar), how could we search to find all the relevant scientific works?
(a) Country "X" + "Keywords" in the search bar? If we follow this approach, we would have to search country by country across all African nations, which could take a long time.
(b) "Keywords" + "Africa" in the search bar? If we adopt this strategy, the word "Africa" might not appear in an article's title or keywords, so we may miss relevant articles for the metadata development.
(c) Only "Keywords" in the search bar? This strategy returns studies worldwide, which we would then need to screen for those belonging to Africa.
(d) ???
Any suggestion please?
Thanks!
In the metadata file of satellite images, the image acquisition time is given. In which time zone do they provide this time?
- Are these files only used by Petrel?
- What are SGY, ZGY, and SGY.VAL files?
- What is the difference between these files in terms of file size, speed, and metadata?
- What are the use cases of these files?
Is there any public dataset of production microservice DAG metadata, e.g., a dataset containing the shape of the microservice graph?
The camera trap photos show the ambient temperature in the image, but it looks like that information is not in the EXIF data.
Dear all,
We are currently doing research that involves developing an internal DSL using metaprogramming in Ruby. The topic is not yet clear to us, and I would appreciate any good references/examples, please.
Thanks.
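Since the question asks for examples: below is a minimal, self-contained sketch of an internal DSL built with two common Ruby metaprogramming tools, `define_method` and `instance_eval`. The `ExperimentDSL` class and its keywords are invented for illustration:

```ruby
# Minimal internal-DSL sketch: a configuration DSL whose keywords are
# generated at class-definition time and whose blocks are evaluated in
# the DSL object's context.
class ExperimentDSL
  attr_reader :settings

  def initialize
    @settings = {}
  end

  # Dynamically define one setter-style keyword per setting name.
  %i[title sample_size replicates].each do |name|
    define_method(name) { |value| @settings[name] = value }
  end

  # Run a user block with `self` switched to the DSL instance, so bare
  # calls like `title "..."` resolve to the methods defined above.
  def self.build(&block)
    dsl = new
    dsl.instance_eval(&block)
    dsl.settings
  end
end

config = ExperimentDSL.build do
  title "Pilot study"
  sample_size 30
  replicates 3
end
puts config[:sample_size]  # prints 30
```

The same two mechanisms (plus `method_missing` for open-ended vocabularies) underlie well-known Ruby DSLs such as Rake and RSpec.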
📢 IEA EBC Annex#81 calls for datasets of B2G services:
The Annex#81 subtask C3 team is coordinating a survey to collect open datasets for building-to-grid services (B2G) and would like to kindly ask you to contribute by providing a description and link to your data using this Google Form (https://docs.google.com/forms/d/e/1FAIpQLSdqV6MxY0DiJUar9kdkXypXq7EhuhxLP9OzHaN7WjZ9xlFaOg/viewform).
🔖 This survey aims to collect data (time series and metadata) from buildings performing demand-response, demand-side management or energy flexibility. The dataset may be from existing, simulated, or semi-simulated hardware-in-the-loop buildings.
The collected datasets will be used for:
➡️ Gathering use cases and assessing typical DR strategies, buildings types, energy systems and data requirements,
➡️ Testing energy flexibility KPIs for the review activity of C3 on existing KPIs for B2G services,
➡️ And, possibly, comparing the existing solutions against “grid challenges” from the proposed C3 web-based showcase platform.
⌛️This survey should take approximately 20-30 minutes to complete.
Thank you in advance for your valuable contributions!
Don't hesitate to share that post.
Feel free to reach us (Hicham Johra: hj@build.aau.dk; Flávia de Andrade Pereira: flavia.deandradepereira@ucdconnect.ie) if you have any questions.
#Annex81 #survey #dataset #B2G #KPIs #DR #DSM #datadriven #smartbuildings #flexibility #metadata #datarequirements
Given a satellite image without its metadata, can you determine the spatial resolution of the image using analytical tools such as Python?
I'm working on an update to our previous global geochemical database. At the moment, it contains a little over one million geochemical analyses. It contains some basic geochronology data, crystallization dates for igneous rocks and depositional dates for sedimentary rocks. The database differs from GEOROC and EarthChem, in that it includes some interpretive metadata and estimates of geophysical properties derived from the bulk chemistry. I'd like to expand these capabilities going forward.
What would you like to see added or improved?
Here's a link to the previous version:
I am looking for details in CXRs pertaining to the angle of rotation, missing lung portions (like the base and apex), under/over exposure, overlying anatomy (like Chin) and other details on the data quality. Are there any publicly available Chest X-ray collections that include these metadata?
I am trying to access the genomic data (methylation, RNA and miRNA expression) along with the corresponding metadata (clinical and demographic) for the TCGA Pan-Cancer (PANCAN) cohort
(https://xenabrowser.net/datapages/?cohort=TCGA%20Pan-Cancer%20(PANCAN)&removeHub=https%3A%2F%2Fxena.treehouse.gi.ucsc.edu%3A443). I tried both Python and R; I managed to get access to the cohort, but I did not know how to load the datasets for expression or clinical data. I tried the UCSCXenaTools package in R and XenaPython in Python.
Are there other packages that can load the specific gene expression or metadata datasets? And is it possible to access this dataset through GEO?
Is there any metadata analysis for tropical and temperate regions?
This classic Lenski paper computes the effective population size of an evolving *E. coli* population subjected to daily population bottlenecks as $N_e = N_0 \cdot g$, where $N_e$ is the effective population size, $N_0$ is the population size directly after the bottleneck, and $g$ is the number of generations between bottlenecks.
Unfortunately, the formula was not derived in the referenced paper and the referenced articles appear to not describe the formula directly, but only provide the fundamentals for deriving it.
Can someone explain how this formula comes about?
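One way to see where a formula of this shape comes from (a sketch of a standard harmonic-mean argument, not necessarily the derivation Lenski had in mind): between bottlenecks the population doubles $g$ times, and the variance effective size over one cycle is approximately the harmonic mean of the per-generation sizes.

```latex
% Sizes across one daily cycle: N_0, 2N_0, 4N_0, \dots, 2^{g-1}N_0
\[
  N_e \;\approx\; \frac{g}{\sum_{i=0}^{g-1} \dfrac{1}{2^{i} N_0}}
      \;=\; \frac{g\,N_0}{\sum_{i=0}^{g-1} 2^{-i}}
      \;\approx\; \frac{g\,N_0}{2} \qquad (g \gg 1),
\]
% since \sum_{i=0}^{g-1} 2^{-i} \to 2. Up to this order-one constant
% (some serial-passage treatments obtain a factor of ln 2 instead),
% N_e is proportional to N_0 g, the form used in the paper.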
Do there exist open-data websites where one can submit nematode micrographs, VCE videos, and image metadata? A content management system like Drupal could work, though the level of effort and maintenance for such a site would be high.
Extending NemSyst (https://nemys.ugent.be/) with a citizen science add-on that allows users to submit observations?
Many of the existing sites are not quite there yet as far as nematodes go. iNaturalist.org is close -- https://www.inaturalist.org/. May be able to make it work?
What is the main reference book for learning the basics of metadata analysis?
I am trying to install the "FragBuilder" Python module on Windows 10 under Python 2.7 via conda. However, it fails again and again. The page that gives the command to run on Windows is here: https://anaconda.org/bioconda/fragbuilder When I try to run it, it shows the following error message:
Collecting package metadata (current_repodata.json): ...working... done
Solving environment: ...working... failed with initial frozen solve. Retrying with flexible
solve.
Solving environment: ...working... failed with repodata from current_repodata.json, will
retry with next repodata source.
Collecting package metadata (repodata.json): .working... done
Solving environment: ...working... failed with initial frozen solve. Retrying with flexible
solve.
Solving environment: ...working...
Found conflicts! Looking for incompatible packages.
This can take several minutes. Press CTRL-C to abort.
failed
Note: you may need to restart the kernel to use updated packages.
Building graph of deps: 0%| | 0/5 [00:00<?, ?it/s]
Examining @/win-64::__archspec==1=x86_64: 0%| | 0/5 [00:00<?, ?it/s]
Examining python=2.7: 20%|## | 1/5 [00:00<?, ?it/s]
Examining fragbuilder: 40%|#### | 2/5 [00:00<00:00, 11.79it/s]
Examining fragbuilder: 60%|###### | 3/5 [00:00<00:00, 17.69it/s]
Examining @/win-64::__cuda==10.2=0: 60%|###### | 3/5 [00:00<00:00, 17.69it/s]
Examining @/win-64::__win==0=0: 80%|######## | 4/5 [00:00<00:00, 17.69it/s]
Determining conflicts: 0%| | 0/5 [00:00<?, ?it/s]
Examining conflict for fragbuilder python: 0%| | 0/5 [00:00<?, ?it/s]
UnsatisfiableError: The following specifications were found to be incompatible with each
other:
Output in format: Requested package -> Available versions
Can someone help me fix this issue?
Edit: I checked the GitHub documentation of this module and then installed the fragbuilder module under Python 2.7, as it is not supported on Python 3.8.
According to the RDF* specification, is it possible to have the same triple pattern with different qualifiers?
Example
( << s1, p1, o1>>, q1, v1 )
( << s1, p1, o1>>, q1, v2 )
( <<SPinera, president, Chile>>, start, 2010)
( <<SPinera, president, Chile>>, end, 2014)
( <<SPinera, president, Chile>>, start, 2018)
RDF* definition
An RDF* triple is a 3-tuple that is defined recursively as follows:
1. Any RDF triple t ∈ (I∪B) ×I×(I∪B∪L) is an RDF* triple; and
2. Given RDF* triples t and t′, and RDF terms s ∈ (I ∪ B), p ∈ I and o ∈ (I ∪ B ∪ L), then the tuples (t,p,o), (s,p,t) and (t,p,t′) are RDF* triples.
Reference for RDF* definition
Hartig, Olaf. “Foundations of RDF⋆ and SPARQL⋆ (An Alternative Approach to Statement-Level Metadata in RDF).” AMW (2017).
The students of my Information Security Management course are currently conducting a survey on the topic of Privacy and the use of Metadata. Please support us in filling out the survey: https://www.surveymonkey.com/r/5QQFVLK
Thank you!
Hi,
I've submitted some sequences with incorrect metadata and have already sent an email to NCBI. However, it would be much faster if I could just delete the records and resubmit them.
Thank you very much for your time and patience.
Accounting is traditionally presented as efficiently describing flux (what comes in and goes out) and stock (what is held at a given time), as debit and credit. It is also about matching the terms of an exchange.
How can we move the model beyond a basic number-based description into more data-rich frameworks (including metadata, descriptors, etc.), while benefiting from the deep and long experience of accounting over human history?
With Matrices of sets [1], a first endeavour was made to describe objects rather than numbers attached to them (price, quantity, measurements and features).
With Matrices of L-sets [2] we are going one step further, distinguishing actual assets (as classical sets) and wish lists, orders, needs, requirements which are not yet owned or available. We show how an operational and computable framework can be obtained with such objects.
References:
[1]
Presentation Matrices of Sets, complete tutorial with use cases
[2]
Preprint Matrices of L-sets -Meaning and use
Is there any standardized open format for storing mobile mapping data?
- Images
- Exterior and interior orientation
- Trajectory
- Scan data
- Metadata
Due to the huge amount of data and the number of single files, a format optimized for data transfer would be useful as well.
Do you know of such formats?
Imagine that I am interested in re-analysing a .csv document publicly available on the OSF, e.g., https://osf.io/gdr4q/ Is there any way I could directly access that file from R using something analogous to data <- read.csv2("https://osf.io/gdr4q/", header=TRUE, na.strings="NA")?
I have recently been studying X-ray powder diffraction measurements made with a D4 Endeavor. I can load the metadata in Profex-BGMN, but I don't understand the meaning of the instrument metadata variables (and their units), so I cannot prepare the instrument contribution to run the refinement. Any help from Bruker users?
Thanks!!
Stefano
I have PE data from Miseq (Aalbo_R1.fastq.gz and Aalbo_R2.fastq.gz).
How do I write a valid metadata file for this to work in QIIME?
Sample is Aedes albopictus collected from city environment.
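For reference, a QIIME 2 sample metadata file is a plain TSV whose first column header is `sample-id`, optionally followed by a `#q2:types` row declaring column types. A minimal sketch for this dataset; the column names beyond `sample-id` and the sample ID itself are illustrative assumptions:

```tsv
sample-id	species	collection-site
#q2:types	categorical	categorical
Aalbo	Aedes albopictus	city
```

For importing the paired-end reads themselves, a separate manifest file is used, mapping each `sample-id` to the absolute paths of Aalbo_R1.fastq.gz and Aalbo_R2.fastq.gz.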
I want to convert my image stack files (OME-TIFF image stacks) into separate image files (three greyscale TIFFs plus metadata text files).
Is it possible for you to help me with this?
Who also uses eLabFTW for documentation of experiments and analysis? In what research area and what experience do you have?
GISAID provides most of the resources for SARS-CoV-2. I need to download and analyze the metadata of sequences from the USA (only those with patient status "dead/deceased"). But currently I have to check, select, and download every sequence individually (whether or not it comes from a deceased patient). How can I do that in bulk rather than manually?
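Once the metadata package has been downloaded from GISAID as a TSV, the filtering can be scripted locally. A stdlib sketch; the column names `Location` and `Patient status` are assumptions, so check the actual headers in your download:

```python
# Sketch: keep only USA rows whose patient status marks a deceased patient.
import csv
import io

def filter_deceased_usa(tsv_text):
    """Return rows (as dicts) matching USA location and deceased status."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    kept = []
    for row in reader:
        status = row.get("Patient status", "").strip().lower()
        if "USA" in row.get("Location", "") and status in {"dead", "deceased"}:
            kept.append(row)
    return kept

sample = ("Location\tPatient status\n"
          "North America / USA\tDeceased\n"
          "Europe / United Kingdom\tDead\n"
          "North America / USA\tAlive\n")
print(len(filter_deceased_usa(sample)))  # prints 1
```

The same filter works streamed over the full multi-gigabyte metadata file by reading it with `open(path, newline="")` instead of `io.StringIO`.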
I've successfully uploaded my metadata file and assembled shotgun metagenomic dataset to my MG-RAST inbox. The metadata passed the QC-like step, my sequence data uploaded just fine, and I could happily move on to the submission step. Everything works beautifully until I get to the tab where I submit my sequence data, which is there but in red text, and I can't select it.
Has anyone else had this issue and knows what the problem is?
I have deleted and re-uploaded both the sequence and metadata files but with no success. Any help/advice would be hugely appreciated!!
Thanks in advance.
I am currently working on the metadata of coral reef-associated bacterial communities in R. I would like to display a pie chart with two layers: an inner layer (Phylum) and an outer layer (Class). I don't know how to write a script to visualize this data. Can anybody help me with this? Thank you very much!
Hello,
My Landsat 8 (Collection 2 Level 2) 7-band layer stack has a cyan tint across the entire image. Do I need to apply any kind of correction? Level 2 Landsat 8 imagery should be corrected to surface reflectance, but how can I read the metadata to be certain about what level of processing the data had?
How can I read the reflectance values in ENVI to see if my data is ready to use?
Thank you!
I am new to ENVI and attempting to do a radiometric calibration on Sentinel 2 L1C images. The metadata in the ENVI header shows 0.000 for the gain and offset values of all of the bands. I believe this is why I am getting "no data" when I attempt the radiometric calibration.
Where can I find or calculate the gain and offset values to input them using the "Apply Gain and Offset" tool? I haven't found this information in the metadata, but I probably don't know what to look for.
Thank you!
Developing metadata for quantitative effects estimation, what tool would be most friendly and accessible for metadata handling?
I'm working on a project proposal that will develop a historical event-based model and a historical SDI (snapshot model). I'm wondering which metadata standard suits both better. The case study is in the Iberian Peninsula.
I need to build a ligand-receptor interaction map for bulk RNA sequencing data (mouse). All the methods I found want a matrix with columns of gene symbols and mean expression values for each cell type. I only have TSV files with metadata and counts. Do you know how to get this from the data I have? Is there an R library/protocol/tutorial for that? Which method do you suggest for obtaining the receptor-ligand interactome for bulk RNA?
Here is how my metadata looks like:
id nCount_RNA nFeature_RNA PercentMito ERCCCounts PercentERCC Animal Plate
X11_E1 569589 11505 0.00331115945006 20 3.51E-05 11 11 X11A10
.......
Birthdate Gender Organ CellType RowID ColID
old Female BM GMP E 1
.......
Counts:
gene X11_E1 X11A10 X11A12 X11A3 X11A5 ........
Gnai3 23 4 22 25 94 ..........
.......
How can I use R to extract the time interval between visits of different species from camera trap images?
For example, species A arrives at 07:00 (6 photos, each 1 min apart, so the last photo is at 07:05),
and species B arrives at 07:45 (so a 45-min interval from A's arrival, or more precisely a 40-min interval from A's last photo).
I have tens of thousands of images, and I have used camtrapR to extract the record table, but the delta.time columns only give the interval between photos of the same species during the same visit, not between species. I want to investigate the effect of species A's presence on species B.
Any advice?
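The underlying logic is a single pass over the time-sorted record table; sketched here in Python for clarity (the same grouping translates directly to R on the camtrapR record table, where the timestamps would come from the DateTimeOriginal column):

```python
# Sketch: minutes from the last photo of species A to the first
# subsequent photo of species B, per visit pair.
from datetime import datetime

def interspecies_intervals(records, species_a, species_b):
    """records: iterable of (timestamp, species). Returns a list of
    intervals in minutes, one per A-visit followed by a B-arrival."""
    events = sorted(records)          # chronological order
    intervals = []
    last_a = None
    for ts, sp in events:
        if sp == species_a:
            last_a = ts               # keep updating through A's visit
        elif sp == species_b and last_a is not None:
            intervals.append((ts - last_a).total_seconds() / 60)
            last_a = None             # only B's first arrival after A counts
    return intervals

# The worked example from the question: A at 07:00-07:05, B at 07:45.
recs = [(datetime(2024, 1, 1, 7, m), "A") for m in range(6)]
recs.append((datetime(2024, 1, 1, 7, 45), "B"))
print(interspecies_intervals(recs, "A", "B"))  # -> [40.0]
```

A real analysis would additionally group records by camera station before applying this, so that intervals are never computed across sites.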
Hello,
Is there any shareable MOOC metadata dataset that contains details about each course's different sections? Or any open API that can extract those data?
Many thanks in advance.
Hello Research Gate community,
I have a question about my interpretation of capscale() in the vegan R package and how to assess the variance explained by the interaction effect.
Imagine a significant model like this: Var1 + Var2 + Var1:Var2
> RsquareAdj(capscale(otu_table ~ Var1 + Var2 + Var1:Var2, metadata, distance = "bray"))$adj.r.squared
[1] 0.281792
Then I can obtain the variance of the main factors
> RsquareAdj(capscale(otu_table ~ Var1 , metadata, distance = "bray"))$adj.r.squared
[1] 0.1270805
> RsquareAdj(capscale(otu_table ~ Var2, metadata, distance = "bray"))$adj.r.squared
[1] 0.09308548
Then, is this the right way to calculate the Adj.R2 for the interaction?
> RsquareAdj(capscale(otu_table~ Var1:Var2 + Condition(Var1) + Condition(Var2), metadata, distance = "bray"))$adj.r.squared
[1] 0.05174793
However, if I sum the variances altogether I do not get the variance explained by the full model
0.09308548 + 0.1270805 + 0.05174793 = 0.2719139
I looked online but I could not find any decent explanation of this.
Thank you for your help!
Nico
I need a simple data catalog to manage some datasets and to visualize metadata about them (descriptive statistics, variables’ description, …). Datasets are CSV files in several folders and some tables in an SQLite database, everything stored in a Mac desktop. What solution would you recommend?
The two main ones I know are CKAN and Amundsen, but I find them difficult to set up and use. Do you know any simple software to manage this kind of task?
Thanks,
Hi, I am quite a newbie with Python, and I need to run some text mining analyses on 100+ literary texts in German, which I have stored as individual txt files in a folder. They are named with the scheme author_title_date (for example "schnitzler_else_1924.txt").
I was thinking of using the python package nltk and/or spaCy, and maybe the Stanford NER, as I need to analyse sentiments in the different texts and to identify specific locations as well as the sentiments in relation to such locations.
I am stuck on a very preliminary step though: how do I import all the text files from the folder into a single corpus/vector corpus that retains the metadata in the filename? I could produce that relatively easily in R with tm, but I can't find a way to do it in Python. Thanks!
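Loading the folder into a metadata-carrying corpus needs only the standard library; each document can then be handed to nltk or spaCy with its metadata intact. A minimal sketch (it assumes author and title contain no underscores of their own, per the author_title_date scheme):

```python
# Sketch: read every .txt file in a folder into a list of dicts that
# keep the author/title/date metadata encoded in the filename.
from pathlib import Path

def load_corpus(folder):
    corpus = []
    for path in sorted(Path(folder).glob("*.txt")):
        # "schnitzler_else_1924" -> ("schnitzler", "else", "1924")
        author, title, date = path.stem.split("_", 2)
        corpus.append({
            "author": author,
            "title": title,
            "date": date,
            "text": path.read_text(encoding="utf-8"),
        })
    return corpus
```

Each entry is then directly usable, e.g. `[d["text"] for d in corpus if d["author"] == "schnitzler"]`, which mirrors the metadata filtering tm's readerControl provides in R.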
I am trying to extract some information (metadata) from GenBank using the R package "rentrez" and the example I found here https://ajrominger.github.io/2018/05/21/gettingDNA.html. Specifically, for my group of interest, I search for all records that have geographical coordinates and then want to extract data about the accession number, taxon, sequenced locus, country, lat_long, and collection date. As an output, I want a csv file with the data for each record in a separate row. I am attaching the script I have constructed and it seems it can do the job but at some point, rows get muddled with data from different records overlapping the neighbouring rows. For example, from the 157 records that rentrez retrieves from NCBI the first 109 records in the resulting file look like what I want to achieve but the rest is a total mess. I suspect this happens because the XML contents differ a bit between the GenBank entries but cannot figure out how to fix the problem. Any help would be greatly appreciated because I am a newbie with R and figuring out each step takes a lot of time. Thanks in advance!
I am working with Landsat 5 TM images. Unlike previous USGS Landsat products, which contained metadata in TXT format, the new products are in XML format and are atmospherically corrected scenes. So I can't use ENVI 5 to do the radiometric corrections, as ENVI only supports the TXT metadata format.
As I try to use the NDWI = (Band2 - Band4)/(Band2 + Band4) calculation, I get a range of -6.2 to 6.1 instead of -1 to +1. This might be due to negative or fill values in the atmospherically corrected image.
How to resolve this issue?
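A unit check helps here: for strictly positive band values, (B2 − B4)/(B2 + B4) cannot leave [−1, 1], so values of ±6 mean negative numbers entered the ratio, e.g. reflectances that became negative after the offset, or fill pixels that should be masked. For Landsat Collection 2 Level-2 surface reflectance, USGS documents the conversion reflectance = DN × 0.0000275 − 0.2 (older Collection 1/ESPA SR products used a 0.0001 scale with no offset; check the product guide for your download). A minimal sketch:

```python
# Sketch: convert Collection 2 Level-2 stored values to surface
# reflectance, then compute NDWI, masking invalid pixels.
SCALE = 2.75e-05   # USGS Collection 2 Level-2 SR scale factor
OFFSET = -0.2      # USGS Collection 2 Level-2 SR offset

def to_reflectance(dn):
    """Stored Level-2 integer -> surface reflectance."""
    return dn * SCALE + OFFSET

def ndwi(green_dn, nir_dn):
    """NDWI from green (TM Band 2) and NIR (TM Band 4) stored values.
    Returns None for non-positive reflectances, which would otherwise
    push the index outside [-1, 1]."""
    g, n = to_reflectance(green_dn), to_reflectance(nir_dn)
    if g <= 0 or n <= 0:
        return None  # fill/invalid pixel: mask rather than compute
    return (g - n) / (g + n)

print(ndwi(20000, 10000))
```

Applying the same mask-then-scale order band-wise (e.g. with numpy on the full rasters) keeps the resulting NDWI within the expected −1 to +1 range.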
I am looking for wave direction data along the southern coast of the Caspian Sea, to draw wave roses at Anzali port, Amirabad port, Babolsar, Nowshahr port, etc. I already have the Iranian National Institute for Oceanography and Atmospheric Science's metadata, but it was not helpful. Is there any other online provider?
(Time-history data at one point in the area around each port is adequate.)
I have used fitz (PyMuPDF) so far to extract things like font, text size, etc. from a PDF file. What are some other good tools/libraries that can extract meaningful metadata about the contents of the PDF?
Some libraries only expose the document-level metadata of the whole PDF rather than the content (I am not looking for those).
Any techniques are also helpful.
Is there a library or technique to access this information? Also, can we target metadata in a specific part of the PDF document?
Nowadays, the Spanish Cybermetrics Lab ranks about 25 thousand universities and about 2 thousand OA repositories.
The third partial indicator, "openness", of the Webometrics university ranking depends on the filling of university OA repositories, since Google Scholar finds "rich files" in these repositories.
This, in turn, relies on Google Scholar and the OA repositories operating within the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH). This case history is illustrated by the link http://www.bsu.edu.ru/en/news/detail.php?ID=318290 on the Webometrics ranking of Belgorod State University (Russia).
I propose that university management should pay close attention to the above effect. In this context, what is the situation at other universities?
A Digital Object Identifier (DOI) is a unique identifier assigned to scholarly articles to ensure international standardization and easier identification of published articles even if the metadata URL changes. Should all publishers register and assign DOIs to their articles? Should DOI assignment be one of the key criteria authors use in choosing a good journal?
Hi. I activated QIIME 2 2019.7 and prepared a folder named '6 sample file' to try data import; it contains the metadata, the manifest, and 6 sequence files, as in the picture. I received the error below. Maybe it is because the manifest and metadata are not .qza files (I read the tutorial but could not find how to convert them into .qza). Thanks a lot.
(1/1?) no such option: -q
(qiime2-2019.7) a-PC:~ hebahussein$
(qiime2-2019.7) a-PC:~ hebahussein$ qiime tools import \
> --type 'SampleData[PairedEndSequencesWithQuality]' \
> --input-path 6 sample import \
> --output-path paired-end-demux.qza \
> --input-format PairedEndFastqManifestPhred64V2
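One likely cause of the "no such option" error above: the input path contains spaces and is unquoted, so the shell hands qiime several separate tokens instead of one path. (The manifest and metadata do not need to be .qza themselves; `qiime tools import` is the step that produces the .qza.) A small sketch of the shell's splitting, reproduced with the stdlib `shlex` module and the path from the command above:

```python
# shlex.split reproduces POSIX shell word-splitting, showing why an unquoted
# path with spaces arrives at qiime as three separate arguments.
import shlex

unquoted = "qiime tools import --input-path 6 sample import --output-path demux.qza"
quoted   = "qiime tools import --input-path '6 sample import' --output-path demux.qza"

print(shlex.split(unquoted))  # ..., '--input-path', '6', 'sample', 'import', ...
print(shlex.split(quoted))    # ..., '--input-path', '6 sample import', ...
```

Quoting the path in the qiime command, e.g. `--input-path '6 sample import'`, delivers it as a single argument.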

Dear colleagues,
Based on our successful experience in establishing the SWIG database on infiltration data, we decided to initiate another project to collect soil hydraulic properties and their metadata. Currently available databases that provide estimates of the soil water retention curve and the hydraulic conductivity curve use pedotransfer functions that do not include information about the structural properties. We are therefore seeking original soil hydraulic data (soil water retention curve, hydraulic conductivity curve, and air permeability) obtained on undisturbed soil samples and their corresponding soil properties (e.g. texture, carbon content, bulk density, taxonomic and land-use information) as well as quantitative and/or qualitative information on the soil structure. These structural data may include:
. Taxonomic information (blocky, granular, etc.)
. Ksat on undisturbed soils
. Pore-size data (obtained e.g. from nuclear magnetic resonance (NMR) relaxometry)
. Aggregate stability
. Geometric mean or mean weight diameter (GMD or MWD) of aggregates
. Fractal dimension
. Mechanical resistance, etc.
Any metadata describing the experimental data in more detail are welcome. Data may stem from lab-, plot-, field-, and catchment-scale experiments. We are planning to establish a database of these data that we will submit to a data journal (e.g. Earth System Science Data or any other suitable journal). In order to honor your support, we would like to include you as a co-author on the manuscript. We hope that you will contribute to this initiative and welcome any questions you might have.
Please use the enclosed template file (excel file) to share data with us.
We would also appreciate it if you could kindly distribute this request to colleagues who might be interested in contributing to this effort.
Our published SWIG database article is provided at the following link, for those whose support we did not have the chance to obtain for that project.
Best regards,
Mehdi Rahmati (mehdirmti@gmail.com)
Rainer Horn (rhorn@soils.uni-kiel.de )
Harry Vereecken (h.vereecken@fz-juelich.de )
Lutz Weihermüller (l.weihermueller@fz-juelich.de )
Hi everybody, I have problems opening a .xdce file. The file is supposed to contain several series of images taken with the InCell 2200 microscope. I tried to open the file using either ImageJ with the Bio-Formats package or Fiji, but I was not able to visualize the images. When I try to open the file it asks me which series I want to open; it opens the file and I can see the metadata, but the images are black. Modifying the brightness and contrast doesn't work; it's as if there is no fluorescence. This warning appears every time I try to open the file: "[WARN] Unknown ExperimentType value 'Fluorescence' will be stored as 'Other'". I don't understand whether the file is damaged or whether I am not opening it properly. I fear an error occurred while transferring the folder containing the images from the microscope's computer to my hard disk, because the folder contains only one file (it is supposed to contain several series of images) and also because my computer doesn't seem to recognize the .xdce file (it has no icon, as when the computer doesn't know which app to use to open a file). Reading the ImageJ guide, I understood this file extension should be supported by ImageJ and Fiji. Thank you in advance!
Hello,
I have several years of data from various projects that I'd like to build into a database, but I don't know how. Microsoft Excel is useful for tabulating individual datasets but not for grouping them all together. My data consist of benthic species lists and abundances, with metadata such as client, project number, project name, station names, station coordinates, and more.
Thanks in advance,
Shazara
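For data of this shape, a small relational database is a natural fit; a minimal stdlib sketch (table and column names are illustrative, not prescriptive) with projects, stations, and abundance records:

```python
# Minimal relational sketch with sqlite3 (stdlib): projects -> stations ->
# abundances, so species records from all projects live in one queryable file.
import sqlite3

con = sqlite3.connect(":memory:")       # use a file path for a real database
con.executescript("""
CREATE TABLE project (id INTEGER PRIMARY KEY, client TEXT, number TEXT, name TEXT);
CREATE TABLE station (id INTEGER PRIMARY KEY,
                      project_id INTEGER REFERENCES project(id),
                      name TEXT, lat REAL, lon REAL);
CREATE TABLE abundance (station_id INTEGER REFERENCES station(id),
                        species TEXT, count INTEGER);
""")
con.execute("INSERT INTO project VALUES (1, 'ClientA', 'P-001', 'Survey 2020')")
con.execute("INSERT INTO station VALUES (1, 1, 'ST01', 51.5, -0.1)")
con.execute("INSERT INTO abundance VALUES (1, 'Nephtys hombergii', 12)")

rows = con.execute("""SELECT p.name, s.name, a.species, a.count
                      FROM abundance a
                      JOIN station s ON a.station_id = s.id
                      JOIN project p ON s.project_id = p.id""").fetchall()
print(rows)
```

SQLite needs no server (the database is a single file), existing Excel sheets can be bulk-loaded via CSV export, and R or Python can then query across all projects at once.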
I am watching the QIIME metadata-importing tutorial. I downloaded the barcode and sequence files using the URLs and tried to open them in QIIME, but it does not work, as follows:
(micro) a-PC:desktop hebahussein$ cd emp-single-end-sequences
(micro) a-PC:emp-single-end-sequences hebahussein$ qiime tools import \
> --type EMPSingleEndSequences \
> --input-path emp-single-end-sequences \
> --output-path emp-single-end-sequences.qza
-bash: qiime: command not found
(micro) a-PC:emp-single-end-sequences hebahussein$
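A likely cause of the `command not found` above: the prompt shows the `(micro)` conda environment is active, while QIIME 2 was presumably installed into a different environment (`qiime2-2019.7` is the name used elsewhere in this thread; an assumption here). Activating that environment, e.g. `conda activate qiime2-2019.7`, puts `qiime` on the PATH. Python's `shutil.which` mimics the shell's lookup and makes a quick check:

```python
# shutil.which performs the same PATH lookup the shell does; None means no
# "qiime" executable is visible in the currently active environment, which
# is fixed by activating the env QIIME 2 was installed into.
import shutil

print(shutil.which("qiime"))
```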
I have raw images without any associated metadata. I believe exposure time is essential for performing HDR composition. Is there any way or tool to estimate the exposure time? Or is there an algorithm that does the composition without this data?
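Two things may help here. First, classic HDR response-curve recovery (Debevec-Malik, Robertson) determines exposure times only up to a global scale, so relative ratios between shots are often enough; and OpenCV's Mertens exposure fusion (`createMergeMertens`) composes an HDR-like result without any exposure times at all. A toy sketch (hypothetical helper, 8-bit pixel lists) of estimating a relative exposure ratio from the images themselves:

```python
# Hypothetical sketch: estimate exposure(img_b) / exposure(img_a) from paired
# 8-bit pixels, using only well-exposed values (neither clipped nor noisy).
def relative_exposure(img_a, img_b, lo=10, hi=245):
    ratios = sorted(b / a for a, b in zip(img_a, img_b)
                    if lo < a < hi and lo < b < hi)
    return ratios[len(ratios) // 2]     # median is robust to outliers

print(relative_exposure([20, 40, 80], [40, 80, 160]))  # 2.0
```

The estimated ratios can then be fed to a standard HDR merge in place of true exposure times, up to an overall brightness scale.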
I want to extract all metadata from a digital audio file for authentication purposes.
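As a starting point, Python's stdlib `wave` module already exposes the WAV format chunk; for the fuller tag sets relevant to authentication (ID3, BWF, container atoms) dedicated tools such as mutagen or ExifTool report far more fields. A self-contained sketch (it synthesises a tiny WAV in memory so it runs without an input file):

```python
# Stdlib sketch: write a minimal mono 16-bit WAV into a buffer, then read
# its format metadata back with the wave module.
import io
import wave

buf = io.BytesIO()
with wave.open(buf, "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(2)              # 16-bit samples
    w.setframerate(44100)
    w.writeframes(b"\x00\x00" * 100)

buf.seek(0)
with wave.open(buf, "rb") as r:
    print(r.getnchannels(), r.getsampwidth(), r.getframerate(), r.getnframes())
```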
I faced a problem in QIIME 2 when I was demultiplexing my data. The error is below:
There was an issue with loading the file Barcodemetadate.txt as metadata:
Metadata file path doesn’t exist, or the path points to something other than a file. Please check that the path exists, has read permissions, and points to a regular file (not a directory): Barcodemetadate.txt
There may be more errors present in the metadata file. To get a full report, sample/feature metadata files can be validated with Keemei: https://keemei.qiime2.org
Find details on QIIME 2 metadata requirements here: https://docs.qiime2.org/2019.1/tutorials/metadata/
I have already checked my metadata file with Keemei. It said 'Good job!'; I did not have any errors in Keemei.
My data have barcodes, so they must be demultiplexed and then denoised. I used the only option in the tutorial, qiime demux emp-paired.
My data are Casava 1.8, already imported into a .qza artifact. The next step is demultiplexing.
The metadata is ready and was already checked by Keemei; it reported no errors.
The command I used is:
qiime demux emp-paired \
--m-barcodes-file MyMetadataFileName.txt \
--m-barcodes-column Columename \
--i-seqs Artifcatname.qza \
--o-per-sample-sequences demux.qza \
--p-rev-comp-mapping-barcodes
Can anyone send me a relevant command to demultiplex my data? Or does somebody have an idea?
Thanks
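The error quoted above is a filesystem complaint, not a formatting one: Keemei validates the file's content, while QIIME is saying the path `Barcodemetadate.txt` does not resolve from the current working directory. A quick stdlib check before rerunning the command (or simply pass the file's absolute path to `--m-barcodes-file`):

```python
# Check whether the metadata file is visible from the current working
# directory; if not, the QIIME command needs the absolute path instead.
from pathlib import Path

meta = Path("Barcodemetadate.txt")
if meta.is_file():
    print("found:", meta.resolve())
else:
    print("not found from", Path.cwd(), "- pass the file's absolute path")
```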
As far as I can tell (I may be wrong), there are three basic trends in data science: metadata solutions, data-pattern-recognition solutions, and classifying data into classes.
My question is: what are the possible future directions in data science?
I am looking for an R package and Python library for metadata development. Any suggestions will be highly appreciated. Below is more information about what I want.
I am working on a data project that requires me to create and store metadata, either directly inside the data frame or externally to it; the important thing is that users must be able to update the metadata as more discoveries about the datasets are uncovered. The metadata must be able to store all sorts of information, such as the structure of the data, properties of the datasets, size, creator, and creation time of each dataset, with the hope that the metadata can be explored and visualised separately in the future.
Many thanks.
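For the Python side, one lightweight stdlib pattern is to keep a structured, mutable metadata record alongside the data and serialise it to JSON for storage (the field names below are illustrative, not a standard); pandas also lets you attach a plain dict to a data frame via `DataFrame.attrs`, though that feature is documented as experimental:

```python
# Sketch: a mutable, serialisable metadata record kept next to the data.
# Field names are illustrative; extend them to match your project's needs.
import datetime
import json
from dataclasses import dataclass, field, asdict

@dataclass
class DatasetMetadata:
    creator: str
    created: str = field(default_factory=lambda: datetime.date.today().isoformat())
    n_rows: int = 0
    columns: dict = field(default_factory=dict)   # column name -> description
    notes: list = field(default_factory=list)     # discoveries appended over time

meta = DatasetMetadata(creator="A. Analyst", n_rows=1200,
                       columns={"species": "Latin binomial name"})
meta.notes.append("2 duplicate stations removed")   # users update as they learn
print(json.dumps(asdict(meta), indent=2))           # ready to store or visualise
```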
This project is looking for soil organic carbon data and changes in China.
We are seeking your support in collecting SOC data to predict and map SOC. What we would need are the SOC, bulk density, and soil texture contents for entire China or any provinces or any district in China.
Any more detailed metadata are welcome.
Thanks in advance.
I am trying to read the metadata file (*MTL.TXT) of Landsat 8 in Python 2.7.12. I have written a few functions. I want to return only specific values, or perhaps store them in a global variable, so that I can access them outside the function. Inside the function I am able to access those specific values. My aim is to convert DN values to reflectance.
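Rather than globals, one option is to parse the whole MTL file once into a dictionary and return it; callers then pull out whichever constants they need. The commented lines show how the reflectance rescaling constants (REFLECTANCE_MULT_BAND_x / REFLECTANCE_ADD_BAND_x are standard Landsat 8 MTL keys) would be used:

```python
# Parse the key = value pairs of a Landsat MTL metadata file into a dict,
# so no global variables are needed.
def parse_mtl(path):
    values = {}
    with open(path) as f:
        for line in f:
            key, sep, val = line.partition("=")
            if sep:                                  # skip lines without '='
                values[key.strip()] = val.strip().strip('"')
    return values

# mtl  = parse_mtl("LC8..._MTL.txt")
# mult = float(mtl["REFLECTANCE_MULT_BAND_4"])
# add  = float(mtl["REFLECTANCE_ADD_BAND_4"])
# toa_reflectance = mult * dn + add   # then divide by sin(sun elevation)
```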
Since CIDOC-CRM was created for integrating library, archive, and museum resources, why is it not used directly by Europeana and DPLA? Why did Europeana and DPLA bother to create new metadata models?
Can we say that metadata relates to data, while context data relates to objects?
Also, is there a relation between context data and context awareness?
What is the meaning of inbound and outbound metadata with respect to data filtering?
I am trying to perform meta-analyses with the metafor package (R) and I would like to learn more about the package's potential. I have doubts about the model calculations and I would like to verify my analyses against a real example.
Dear All,
If you are somehow familiar with the use of ontologies, please consider answering this survey.
In the context of the recent reconfiguration of the RDA Vocabulary Semantic Services Interest Group [1], we are leading a task group focused on ontology metadata.
The goal of this task group is to review the current state related to metadata description of semantic resources in general (ontologies, vocabularies, thesauri and terminologies); and discuss recommendations and best practices in terms of metadata standards for ontologies.
To give insights to the group and drive our work, we have set up a survey on the topic and would really appreciate your participation.
The survey is anonymous and uses GForm. It should take about 10-15 minutes.
Please take a moment to send your feedback.
Thank you for your help.
Clement Jonquet, Biswanath Dutta, Anne Toulet and Barbara Magana
Dear all,
I can request historical NOAA weather station data before 1997 for about 260 U.S. stations. However, there are around 6000 land-based U.S. stations available through MesoWest, and the metadata tells me that some of these stations started before 1997. MesoWest archived data beginning with 1997. Is there a way to access pre-1997 data for these MesoWest stations, perhaps through a third-party data provider?
I have thousands of images that I want to build a database around (in Excel), beginning with the information available in their metadata (e.g. date, camera name), which is viewable in Windows when browsing files and their properties. I don't want to enter these individually. My second goal is to automatically create hyperlinks to each file's location on my machine, so that a link in the database could be clicked and the file launched with its default program. Is this easily possible without purchasing specialized software? Perhaps some macros written in Excel could do this?
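If a short script is acceptable alongside Excel macros, the sketch below (stdlib only; reading EXIF fields such as camera model would additionally need a library like Pillow) builds a CSV inventory whose last column is an Excel `HYPERLINK` formula, so each row opens the file with its default program when clicked:

```python
# Sketch: inventory a folder's files (filesystem metadata only) into a CSV
# that Excel can open; the "link" column is an Excel HYPERLINK formula.
import csv
import datetime
from pathlib import Path

def inventory(folder, out_csv):
    with open(out_csv, "w", newline="") as f:
        w = csv.writer(f)
        w.writerow(["name", "modified", "size_bytes", "link"])
        for p in sorted(Path(folder).glob("*")):
            if p.is_file():
                st = p.stat()
                mtime = datetime.datetime.fromtimestamp(st.st_mtime).isoformat()
                link = f'=HYPERLINK("file:///{p.resolve().as_posix()}", "open")'
                w.writerow([p.name, mtime, st.st_size, link])

# inventory("C:/photos", "photo_inventory.csv")   # then open the CSV in Excel
```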
What methods can be used to evaluate a metadata concept? Are there any tools that can be used? Are there any papers discussing metadata evaluation?
I would like to add some documentation to an SPSS file I want to make public. Is there any option to include details about the observations, acronyms, etc.?
Scholars, I am looking for a suitable metadata schema for digital heritage objects: text, images, and AV.
Your help will be greatly appreciated.
I am working on the topic "Utility Enhancement for Textual Document Redaction and Sanitization". I have noted in the de-identification literature for medical documents that privacy models perform unnecessary sanitization by sanitizing negated assertions (e.g. "AIDS negative"). I want to exclude negated assertions before sanitizing a medical document, which will improve the document's utility. I want to know which dataset would be appropriate for my work. I tried to use the 2010 i2b2 dataset, but I could not find its metadata. The 2014 i2b2 de-identification challenge Task 1 consists of 1304 medical records for 296 patients, of which 790 records (178 patients) are used for training and the remaining 514 records (118 patients) for testing. The medical records are a fully annotated gold-standard set of clinical narratives. The PHI categories are grouped into seven main categories with 25 associated sub-categories. The distributions of PHI categories in the training and test corpora are known; for example, the test corpus contains 764 ages, 4980 dates, 875 hospitals, etc. I want to know the same information for the 2010 i2b2 dataset, which I have not yet been able to find.
Thank you.
I am doing my dissertation about metadata using video, but I don't know how to gather much information about video statistics. Any thoughts or suggestions? I hope to see which parts of my video people watch most, to capture viewing patterns.
Sentinel-2 imagery granules downloaded from the AWS archive (http://sentinel-pds.s3-website.eu-central-1.amazonaws.com) have a slightly different data structure than the original granule sub-folders of the ESA SAFE product. Consequently, it seems the SNAP tool is not able to import the data as one product with metadata, and I need to process the data to Level 2 (atmospherically corrected ground reflectance). Is there a way? Or is there a site that allows downloading individual granules in the original SAFE data structure?
(question originally asked on stackexchange)
We are trying to implement a web-based catalog for our photo library. One important thing is compatibility between the software and IPTC metadata.
The issue is that my target area from different days doesn't overlap each other. When I use UTM coordinates to reference instead of lat-lon values, the results are even worse.
Here I describe my methodology which might help to debug the issue.
I am trying to use Landsat 8 data to track temporal changes on the land surface (for a small area, about 15*15 km, about 512*512 pixels of a Landsat 8 image). The satellite passes over the same area on Earth (let's call it the target) approximately every 16 days. Since the position covered by the satellite changes slightly after 16 days, the different-day satellite images need to be referenced in order to overlay images from different dates.
To locate target on different temporal layers of Landsat, I use the width and height of images and also the coordinates(UTM) mentioned in the metadata file. Example metadata file (https://s3-us-west-2.amazonaws.com/landsat-pds/L8/043/024/LC80430242013259LGN00/LC80430242013259LGN00_MTL.txt)
The X and Y UTM coordinates change linearly with the pixels of the image (my assumption). So, I use formulas like these to obtain the pixels i, j where the boundaries of my target start.
i = ((width / (maxX - minX) * (X-minX)));
j = ((height / (minY - maxY) * (Y-maxY)));
Here minX, minY, maxX and maxY are the boundaries of the image in UTM coordinates. width and height represent the number of columns and rows in image to represent pixels. X & Y represent the absolute coordinates (UTM) where the upper left boundary of the target is located. So, i and j represent the pixel coordinates where the upper left boundary of the target starts. I also tried similar formulas with lat-lon coordinates.
I create an image of my target area (512*512 pixels) by taking the pixels in the ranges (i, i+512) and (j, j+512).
Now, the target area from different days is overlaid approximately well when I use lat-lon coordinates. The target area is quite displaced when I use UTM coordinates. I expected the UTM coordinates to give accurate results, and it doesn't make sense why the overlays are not accurate.
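One plausible cause to check (an assumption, not a diagnosis): the linear pixel-to-UTM mapping is only valid when both scenes share the same UTM zone, and Landsat scenes from different dates or adjacent paths can be delivered in different zones, which would displace the overlay exactly as described; the `UTM_ZONE` field in each MTL file is worth comparing. Note also that `width / (maxX - minX)` is simply `1 / pixel_size` (30 m for Landsat 8 reflective bands), so the formulas can be written directly as:

```python
# Pixel <-> UTM mapping for a north-up scene; valid only when the point and
# the scene corners are expressed in the SAME UTM zone.
def utm_to_pixel(x, y, min_x, max_y, pixel_size=30.0):
    """Return (row, col) for a UTM point in a north-up gridded scene."""
    col = int((x - min_x) / pixel_size)
    row = int((max_y - y) / pixel_size)
    return row, col

print(utm_to_pixel(300300.0, 4999700.0, min_x=300000.0, max_y=5000000.0))  # (10, 10)
```

If the zones differ, reprojecting one scene into the other's zone (or working in a common projection) before applying the formula should remove the displacement.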
How do I reproject Kompsat-2 Level 1R to UTM zone 36N?
There is an option in ENVI 5.3 to open the bundle as Kompsat; it recognises the bands, but in a strange coordinate system. So, is it possible to somehow use the metadata for geocoding the image?
Any researcher needs a library for categorizing and citing other research. Many tools are available, but I prefer the open-source Zotero, which integrates with Firefox, Word, and OpenOffice.
It's easy to add items to Zotero and categorize them using Firefox, then use them in documents in Word or OpenOffice.
Sharing is also important: using the Zotero server you can back up your library remotely, sync between multiple devices, and share part of it in group libraries.
Unfortunately, Zotero doesn't support social research sites like ResearchGate or Academia because no metadata is linked to research items.
Also, groups don't have discussion tools; they are only shared libraries.
What is yours?
I am trying to get morphometric data on Ouachita map turtles from a particular river system and going from website to website is inefficient if there is a one-stop-shopping alternative online.
I want to expand the information about the geospatial standards listed on https://www.fgdc.gov/standards/list. ISO and ANSI charge for their standards. I would like to note that a cost applies to ISO and ANSI standards; however, it's not apparent which, if any, Dublin Core metadata elements (or ISO TC 211 geospatial metadata elements) apply to the cost of a document.
Hi all,
We have many thousands of empty pictures triggered by grass/tree movement in our database, and we are simply looking for a way to filter them out.
We don't need software that can identify species or deal with metadata, just something that can detect empty pictures for us. It would save about 95% of our time...
Thanks!
Stephanie
I'm curious to get time series data at a certain location. If anyone has any references for a paper where they made a dataset available or a data repository I can obtain data from I'd be very grateful!
I am using the Landsat tree cover continuous fields dataset from Hansen et al. (2013) to calculate the mean forest cover within HUC-12 watershed boundaries. However, I can't seem to figure out how to calculate the mean. The attribute table for the data does not contain the pixel values (0-100) for percent canopy cover. I can only find these by identifying a particular pixel or by looking in the metadata. I tried copying the metadata into Excel, but there are far too many zeros. There has to be a way to do this.
See the attached images. The one labeled "attribute table" shows that the pixel values do not show up in the attribute table. The next image, labeled "PXVL", shows that when you identify a pixel the value shows up. The image labeled "metadata" shows the metadata for the image and that there are a lot of zeroes surrounding the outline of the image in this data frame. Is there a way to extract these values to calculate the mean?
Any help will be much appreciated!
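The computation itself is just an average of the in-zone pixels with nodata skipped, which is what ArcGIS's Zonal Statistics as Table tool does with the HUC-12 polygons as zones; a continuous raster generally has no per-pixel rows in its attribute table, so identifying pixels one by one is not needed. A toy sketch with made-up values of what the zonal mean computes:

```python
# Hypothetical 2x2 grid of percent-canopy values and a watershed mask:
# average only the masked, valid (<= 100) pixels, skipping nodata.
values = [[10, 20], [30, 255]]          # 255 = nodata in this sketch
mask   = [[True, True], [True, False]]  # True = pixel inside the watershed

pixels = [v for vrow, mrow in zip(values, mask)
            for v, m in zip(vrow, mrow) if m and v <= 100]
print(sum(pixels) / len(pixels))  # mean of 10, 20, 30 -> 20.0
```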
I would like to perform an analysis in QGIS using tools such as geostatistics, georeferencing, and digitizing with these incomplete and rough sketches of the data. However, since the client does not have proper information about the local coordinates, or any shapefile that could contribute to the GIS, I have found that by using Google Earth Pro I could identify a location which I am confident corresponds to the location in the sketch drawn by Mr. Mostafa El-Sayed, the surveying engineer, in El Bahariya.

My aim is to obtain valuable information from prospectuses (documents that describe a financial security); i.e., I need to build a metadata repository about financial securities by extracting information from the documents that describe them.
A digital badge is a visual representation that signifies a specific achievement with detailed metadata attached. A popular comparison relates digital badges to their precursors in videogames or to their analog Girl/Boy Scout counterparts. In this sense, completing tasks, being recognized for accomplishments, collecting badges, and cooperation or competition adds a game-based layer to this method of visually tracking progress. Using digital badges in higher education can map student learning to course outcomes, program requirements, or institution-wide curriculum initiatives (possibly accreditation). This is of significance to academic libraries for teaching information literacy and explicitly integrating it throughout the curriculum.
I am looking for metadata structures suitable to describe historical tattoos. Are you aware of any projects which developed a metadata schema for such a purpose? Many thanks for your help.
How can I summarize the metadata of a set of rasters in a single file using ArcGIS?
I am studying the importance of 'context' for enterprise information systems. In most systems, 'context' (the social dynamics in which a business process functions) is captured in metadata. This is a very limited way of capturing context, for it is impossible to capture social dynamics that way. I would really like to have the opinion of fellow scientists on this matter. Are there instruments, methods, or systems that allow for capturing social contexts? For the reconstruction of the past, which is absolutely necessary for accountability and governance, the ability to contextualize past actions and transactions is an absolute necessity.
Are there software tools that help web developers create metadata using schema.org vocabularies? Do web developers need metadata training to create such metadata?
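Yes: for example, Google's Structured Data Markup Helper generates schema.org markup interactively, and the Schema Markup Validator (validator.schema.org) checks it, so deep metadata training is not strictly required to get started. The usual output format is JSON-LD embedded in a script tag; a minimal sketch with hypothetical values:

```python
# Build a minimal schema.org description and serialise it as the JSON-LD
# snippet a web page would embed. All field values here are hypothetical.
import json

dataset = {
    "@context": "https://schema.org",
    "@type": "Dataset",
    "name": "Example survey data",
    "creator": {"@type": "Person", "name": "Jane Doe"},
    "dateCreated": "2020-01-15",
}
snippet = f'<script type="application/ld+json">{json.dumps(dataset)}</script>'
print(snippet)
```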
Dear colleagues,
I'm asking for guidance on software for Sentinel-2 image processing, given its file format and metadata structure.
Thank you in advance,
Maria