Questions related to Metadata
The camera trap photos show the ambient temperature in the image, but it seems that this info is not in the EXIF data.
We are currently doing research that involves developing an internal DSL using metaprogramming in Ruby. The topic is not yet clear to us, and I would appreciate any good references or examples.
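A quick way to confirm what the file actually carries is to dump every EXIF tag, for example with Pillow. This is a minimal sketch; the filename is hypothetical, and note that some cameras store temperature only in proprietary maker notes (or burn it into the pixels), in which case a tool like exiftool is needed instead:

```python
from PIL import Image, ExifTags

def dump_exif(path):
    """Return {tag_name: value} for every EXIF tag in the image."""
    exif = Image.open(path).getexif()
    return {ExifTags.TAGS.get(tag_id, tag_id): value
            for tag_id, value in exif.items()}

# tags = dump_exif("IMG_0001.JPG")  # hypothetical filename
# If the temperature is in EXIF at all, it would show up in this dict.
```

If the dict comes back without any temperature-like tag, the value is most likely only rendered into the image overlay by the camera.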
📢 IEA EBC Annex#81 calls for datasets of B2G services: The Annex#81 Subtask C3 team is coordinating a survey to collect open datasets for building-to-grid (B2G) services and would kindly ask you to contribute by providing a description of and link to your data using this Google Form (https://docs.google.com/forms/d/e/1FAIpQLSdqV6MxY0DiJUar9kdkXypXq7EhuhxLP9OzHaN7WjZ9xlFaOg/viewform). 🔖 This survey aims to collect data (time series and metadata) from buildings performing demand response, demand-side management, or energy flexibility. The datasets may be from existing, simulated, or semi-simulated hardware-in-the-loop buildings. The collected datasets will be used for: ➡️ Gathering use cases and assessing typical DR strategies, building types, energy systems, and data requirements, ➡️ Testing energy flexibility KPIs for the C3 review activity on existing KPIs for B2G services, ➡️ And, possibly, comparing the existing solutions against “grid challenges” from the proposed C3 web-based showcase platform. ⌛️ This survey should take approximately 20-30 minutes to complete. Thank you in advance for your valuable contributions! Don't hesitate to share this post. Feel free to reach out to us (Hicham Johra: firstname.lastname@example.org; Flávia de Andrade Pereira: email@example.com) if you have any questions. #Annex81 #survey #dataset #B2G #KPIs #DR #DSM #datadriven #smartbuildings #flexibility #metadata #datarequirements
Given a satellite image without its metadata, can you determine the spatial resolution of the image using analytical tools, like Python for instance?
I'm working on an update to our previous global geochemical database. At the moment, it contains a little over one million geochemical analyses. It contains some basic geochronology data, crystallization dates for igneous rocks and depositional dates for sedimentary rocks. The database differs from GEOROC and EarthChem, in that it includes some interpretive metadata and estimates of geophysical properties derived from the bulk chemistry. I'd like to expand these capabilities going forward.
What would you like to see added or improved?
Here's a link to the previous version:
I am trying to access the genomic data (methylation, RNA and miRNA expression) along with its corresponding metadata (clinical and demographic) for the TCGA Pan-Cancer (PANCAN) cohort
(https://xenabrowser.net/datapages/?cohort=TCGA%20Pan-Cancer%20(PANCAN)&removeHub=https%3A%2F%2Fxena.treehouse.gi.ucsc.edu%3A443). I tried both Python and R; I managed to get access to the cohort, but I did not know how to load the datasets for expression or clinical data. I tried using the UCSCXenaTools package in R and XenaPython in Python.
Are there other packages that can load the specific datasets of gene expression or metadata? And is it possible to access this dataset through GEO?
I am looking for details in CXRs pertaining to the angle of rotation, missing lung portions (like the base and apex), under/over exposure, overlying anatomy (like Chin) and other details on the data quality. Are there any publicly available Chest X-ray collections that include these metadata?
[This classic Lenski paper] computes the effective population size of an evolving *E. coli* population subjected to daily population bottlenecks as $N_e = N_0 \cdot g$, where $N_e$ is the effective population size, $N_0$ is the population size directly after the bottleneck, and $g$ is the number of generations between bottlenecks.
Unfortunately, the formula was not derived in the referenced paper and the referenced articles appear to not describe the formula directly, but only provide the fundamentals for deriving it.
Can someone explain how this formula comes about?
Are there open-data websites where one can submit nematode micrographs, VCE videos, and image metadata? A content management system like Drupal could work, though the level of effort and maintenance for such a site would be high.
What about extending NemSyst (https://nemys.ugent.be/) with a citizen-science add-on that allows users to submit observations?
I am trying to install the "FragBuilder" module for Python 2.7 on Windows 10 via conda. However, it fails again and again. The link that gives the command to run on Windows is here: https://anaconda.org/bioconda/fragbuilder. When I try to run it, it shows the following error message:
Collecting package metadata (current_repodata.json): ...working... done
Solving environment: ...working... failed with initial frozen solve. Retrying with flexible solve.
Solving environment: ...working... failed with repodata from current_repodata.json, will retry with next repodata source.
Collecting package metadata (repodata.json): ...working... done
Solving environment: ...working... failed with initial frozen solve. Retrying with flexible solve.
Solving environment: ...working...
Found conflicts! Looking for incompatible packages. This can take several minutes. Press CTRL-C to abort. failed
Note: you may need to restart the kernel to use updated packages.
Building graph of deps: 0%| | 0/5 [00:00<?, ?it/s]
Examining @/win-64::__archspec==1=x86_64: 0%| | 0/5 [00:00<?, ?it/s]
Examining python=2.7: 20%|## | 1/5 [00:00<?, ?it/s]
Examining fragbuilder: 40%|#### | 2/5 [00:00<00:00, 11.79it/s]
Examining fragbuilder: 60%|###### | 3/5 [00:00<00:00, 17.69it/s]
Examining @/win-64::__cuda==10.2=0: 60%|###### | 3/5 [00:00<00:00, 17.69it/s]
Examining @/win-64::__win==0=0: 80%|######## | 4/5 [00:00<00:00, 17.69it/s]
Determining conflicts: 0%| | 0/5 [00:00<?, ?it/s]
Examining conflict for fragbuilder python: 0%| | 0/5 [00:00<?, ?it/s]
UnsatisfiableError: The following specifications were found to be incompatible with each other:
Output in format: Requested package -> Available versions
Can someone help me fix this issue?
Edit: I checked the GitHub documentation of this module and then installed the fragbuilder module under Python 2.7, as it is not supported on Python 3.8.
According to the RDF* specification, is it possible to have the same triple pattern with different qualifiers?
( << s1, p1, o1>>, q1, v1 )
( << s1, p1, o1>>, q1, v2 )
( <<SPinera, president, Chile>>, start, 2010)
( <<SPinera, president, Chile>>, end, 2014)
( <<SPinera, president, Chile>>, start, 2018)
An RDF* triple is a 3-tuple that is defined recursively as follows: 1. Any RDF triple t ∈ (I∪B) ×I×(I∪B∪L) is an RDF* triple; and 2. Given RDF* triples t and t′, and RDF terms s ∈ (I ∪ B), p ∈ I and o ∈ (I ∪ B ∪ L), then the tuples (t,p,o), (s,p,t) and (t,p,t′) are RDF* triples.
Reference for RDF* definition
Hartig, Olaf. “Foundations of RDF⋆ and SPARQL⋆ (An Alternative Approach to Statement-Level Metadata in RDF).” AMW (2017).
I've submitted some sequences with incorrect metadata. I have already sent an email to NCBI; however, it would be much faster if I could just delete the records and resubmit them.
Thank you very much for your time and patience.
Accounting is traditionally presented as efficiently describing flows (what comes in and goes out) and stocks (what is held at a given time), recorded as debits and credits. It is also about matching the terms of an exchange.
How can we move the model beyond the basic number-based description, into more data-rich (including metadata, descriptors, etc) frameworks, while benefiting from the deep and long experience of accounting over human history?
With matrices of sets, a first endeavour was made to describe the objects themselves rather than numbers attached to them (price, quantity, measurements, and features).
With matrices of L-sets, we go one step further, distinguishing actual assets (classical sets) from wish lists, orders, needs, and requirements that are not yet owned or available. We show how an operational and computable framework can be obtained with such objects.
Presentation: Matrices of Sets, complete tutorial with use cases
Preprint: Matrices of L-sets: Meaning and use
Is there any standardized open format for storing mobile mapping data?
- Exterior and interior orientation
Due to the huge amount of data and the number of single files, a format optimized for data-transfer tasks would be useful as well.
Do you know of such formats?
I have fluorescence microscopy images in LIF format. There were 3 fluorescent dyes that I imaged with 3 channels. Two of the three signals are detected in very similar areas of my sample, so they are difficult to tell apart. Basically, I do not remember which image is in the second channel and which is in the third. I tried looking at the OME metadata but couldn't find anything useful. I would like to know which channel captured which wavelengths. Is there a simple way to get this information?
I have recently been studying X-ray powder diffraction measurements made with a D4 Endeavor. I can load the metadata in Profex-BGMN, but I don't understand the meaning of the instrument metadata variables (and their units), so I cannot prepare the instrument contribution to run the refinement. Any help from Bruker users?
I want to convert my OME-TIFF image-stack files to separate image files (three greyscale TIFFs plus metadata text files).
Is it possible for you to help me with this?
GISAID hosts most of the resources for SARS-CoV-2. I need to download and analyze the metadata of the sequences (only those with patient status dead/deceased) from the USA. But currently I need to check, select, and download every sequence individually (whether or not the patient was deceased). How can I do that in bulk rather than manually?
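Assuming you can export one of GISAID's bulk metadata TSV packages (rather than going through the per-sequence pages), the filtering itself is straightforward in pandas. The column names below ("Location", "Patient status") and the sample rows are assumptions modeled on GISAID-style exports; check them against the real file:

```python
import io

import pandas as pd

# Hypothetical excerpt of a GISAID-style metadata TSV; real column names may differ.
sample_tsv = """strain\tLocation\tPatient status
hCoV-19/USA/X1/2020\tNorth America / USA / Texas\tDeceased
hCoV-19/USA/X2/2020\tNorth America / USA / Ohio\tReleased
hCoV-19/DEU/Y1/2020\tEurope / Germany\tDead
"""

df = pd.read_csv(io.StringIO(sample_tsv), sep="\t")

# Keep USA records whose patient status indicates death.
mask = (df["Location"].str.contains("USA")
        & df["Patient status"].str.lower().isin({"dead", "deceased"}))
usa_deceased = df[mask]
print(usa_deceased["strain"].tolist())
```

With the real export, replace the `io.StringIO` with the path to the downloaded metadata file; the accession list can then be fed back into a bulk sequence download.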
I've successfully uploaded my metadata file and my assembled shotgun metagenomic dataset to my MG-RAST inbox. The metadata passed the QC-like step, my sequence data uploaded just fine, and I could happily move on to the submission step. Everything works beautifully until I get to the tab where I submit my sequence data: the file is there, but in red text, and I can't select it.
Has anyone else had this issue and knows what the problem is?
I have deleted and re-uploaded both the sequence and metadata files but with no success. Any help/advice would be hugely appreciated!!
Thanks in advance.
I am currently working on the metadata of coral reef-associated bacterial communities in R. I would like to display a pie chart with two layers: an inner layer (Phylum) and an outer layer (Class). I don't know how to write a script to visualize this data. Can anybody help me with this? Thank you very much!
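The question is about R, but the two-ring idea is easy to sketch: draw two pies with different radii and a fixed ring width, with the outer (Class) values ordered so they nest inside their Phylum. A minimal matplotlib version with made-up abundances (the same layering can be done in R with ggplot2's geom_col plus coord_polar):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, so the script also runs without a display
import matplotlib.pyplot as plt

# Hypothetical phylum/class abundances; classes are listed in phylum order
# so the outer ring aligns with the inner one.
phyla = {"Proteobacteria": 60, "Firmicutes": 40}
classes = {"Alphaproteobacteria": 35, "Gammaproteobacteria": 25,
           "Bacilli": 30, "Clostridia": 10}

fig, ax = plt.subplots()
# Inner ring: phylum level (smaller radius, fixed ring width).
inner = ax.pie(list(phyla.values()), radius=0.7, labels=list(phyla.keys()),
               wedgeprops=dict(width=0.3, edgecolor="w"))[0]
# Outer ring: class level, a second pie drawn at a larger radius.
outer = ax.pie(list(classes.values()), radius=1.0, labels=list(classes.keys()),
               wedgeprops=dict(width=0.3, edgecolor="w"))[0]
fig.savefig("nested_pie.png")
```

Both rings must sum to the same total (here 100) and share the same starting angle for the nesting to line up.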
My Landsat 8 (Collection 2 Level 2) 7-band layer stack has a cyan tint across the entire image. Do I need to apply any kind of correction? Level 2 Landsat 8 imagery should be corrected to surface reflectance, but how can I read the metadata to be certain about what level processing the data had?
How can I read the reflectance values in ENVI to see if my data is ready to use?
I am new to ENVI and attempting to do a radiometric calibration on Sentinel 2 L1C images. The metadata in the ENVI header shows 0.000 for the gain and offset values of all of the bands. I believe this is why I am getting "no data" when I attempt the radiometric calibration.
Where can I find or calculate the gain and offset values to input into the "Apply Gain and Offset" tool? I haven't found this information in the metadata, but I probably don't know what to look for.
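For reference, Sentinel-2 L1C products are, as far as I know, not shipped with classic gain/offset pairs: DNs are converted to TOA reflectance by dividing by the QUANTIFICATION_VALUE (10000) given in MTD_MSIL1C.xml, and products from newer processing baselines (>= 04.00) additionally carry a per-band RADIO_ADD_OFFSET (typically -1000) to add first. A sketch of the conversion, which could be applied with band math instead of the ENVI tool:

```python
import numpy as np

QUANTIFICATION_VALUE = 10000.0  # from the L1C product metadata (MTD_MSIL1C.xml)

def dn_to_toa_reflectance(dn, radio_add_offset=0.0):
    """Convert Sentinel-2 L1C digital numbers to TOA reflectance.

    Products with processing baseline >= 04.00 also carry a per-band
    RADIO_ADD_OFFSET (typically -1000) that must be added before scaling.
    """
    dn = np.asarray(dn, dtype=float)
    return (dn + radio_add_offset) / QUANTIFICATION_VALUE

# A DN of 1500 corresponds to a reflectance of 0.15 on older baselines:
print(dn_to_toa_reflectance(1500))
```

Check the product's XML metadata to see whether the offset applies to your scenes.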
For developing metadata on quantitative effect estimation, what tool would be most user-friendly and accessible for metadata handling?
I'm working on a project proposal that will develop a historical event-based model and a historical SDI (snapshot model). I'm wondering which metadata standard suits both better. The study case is in the Iberian Peninsula.
I need to build a ligand-receptor interaction map for bulk RNA sequencing data (mouse). All the methods I found require a matrix with a gene-symbol column and mean expression values for each cell type. I have only TSV files with metadata and counts. Do you know how to get this from the data I have? Is there any R library/protocol/tutorial for that? Which method do you suggest for obtaining a receptor-ligand interactome for bulk RNA?
Here is what my metadata looks like:
id nCount_RNA nFeature_RNA PercentMito ERCCCounts PercentERCC Animal Plate
X11_E1 569589 11505 0.00331115945006 20 3.51E-05 11 11 X11A10
Birthdate Gender Organ CellType RowID ColID
old Female BM GMP E 1
gene X11_E1 X11A10 X11A12 X11A3 X11A5 ........
Gnai3 23 4 22 25 94 ..........
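Given those two files, one way to build the required genes x cell-types matrix of mean expression is a groupby over the cell-type labels, sketched here in pandas with a toy excerpt of the tables above (in R the same is a merge plus aggregate):

```python
import pandas as pd

# Hypothetical miniature versions of the two TSVs described above.
metadata = pd.DataFrame({
    "id": ["X11_E1", "X11A10", "X11A12"],
    "CellType": ["GMP", "GMP", "HSC"],
}).set_index("id")

counts = pd.DataFrame({
    "gene": ["Gnai3", "Actb"],
    "X11_E1": [23, 100],
    "X11A10": [4, 80],
    "X11A12": [22, 90],
}).set_index("gene")

# Transpose so rows are cells, attach the cell-type label, then average per type.
mean_expr = (counts.T
             .join(metadata["CellType"])
             .groupby("CellType")
             .mean()
             .T)  # back to genes x cell types
print(mean_expr)
```

With the real data, read both TSVs with `pd.read_csv(..., sep="\t")` first; the resulting matrix is the genes-by-cell-types input the interaction tools expect.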
How can I use R to extract the time interval between visits of different species from camera trap images?
For example, Species A arrives at 07:00 (6 photos, each 1 min apart)
Species B arrives at 07:45 (so a 45 min time interval, or more precisely a 39 min time interval).
I have tens of thousands of images, and I have used camtrapR to extract the record table, but its deltatime only gives the time interval between photos of the same species within the same visit, not between species. I want to investigate the effect of species A's presence on species B.
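One way to get inter-species gaps from a camtrapR-style record table is to collapse consecutive same-species photos into visits and then difference the timestamps. A pandas sketch of that logic with a made-up record table matching the example above (the same shift/compare idea ports to R with dplyr::lag):

```python
import pandas as pd

# Hypothetical camtrapR-style record table (one row per photo).
rec = pd.DataFrame({
    "Station": ["S1"] * 3,
    "Species": ["A", "A", "B"],
    "DateTimeOriginal": pd.to_datetime(
        ["2021-05-01 07:00", "2021-05-01 07:06", "2021-05-01 07:45"]),
}).sort_values(["Station", "DateTimeOriginal"])

# A new "visit" starts whenever the species (or station) changes.
new_visit = (rec["Species"] != rec["Species"].shift()) | \
            (rec["Station"] != rec["Station"].shift())
visits = rec[new_visit].copy()
visits["last_photo_prev"] = rec["DateTimeOriginal"].shift()[new_visit]

# Interval between the last photo of the previous (different-species)
# visit and the first photo of this one: 07:45 - 07:06 = 39 min here.
visits["interspecies_gap"] = visits["DateTimeOriginal"] - visits["last_photo_prev"]
print(visits[["Species", "interspecies_gap"]])
```

The record table exported by camtrapR can be read with `pd.read_csv` and fed through the same logic, grouping by station.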
Hello Research Gate community,
I have a question about my interpretation of capscale() in the vegan R package and how to assess the variance explained by the interaction effect.
Imagine a significant model like this: Var1 + Var2 + Var1:Var2
> RsquareAdj(capscale(otu_table ~ Var1 + Var2 + Var1:Var2, metadata, distance = "bray"))$adj.r.squared
Then I can obtain the variance of the main factors
> RsquareAdj(capscale(otu_table ~ Var1 , metadata, distance = "bray"))$adj.r.squared
> RsquareAdj(capscale(otu_table ~ Var2, metadata, distance = "bray"))$adj.r.squared
Then, is this the right way to calculate the Adj.R2 for the interaction?
> RsquareAdj(capscale(otu_table~ Var1:Var2 + Condition(Var1) + Condition(Var2), metadata, distance = "bray"))$adj.r.squared
However, if I sum the variances altogether, I do not get the variance explained by the full model:
0.09308548 + 0.1270805 + 0.05174793 = 0.2719139
I looked online but I could not find any decent explanation of this.
Thank you for your help!
I need a simple data catalog to manage some datasets and to visualize metadata about them (descriptive statistics, variables’ description, …). Datasets are CSV files in several folders and some tables in an SQLite database, everything stored in a Mac desktop. What solution would you recommend?
The two main ones I know are CKAN and Amundsen, but I find them difficult to set up and use. Do you know any simple software to manage this kind of task?
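For something this small, a short script may beat a full catalog tool: profile each CSV and each SQLite table with pandas and write the results to a JSON (or HTML) file you can browse. A minimal sketch, with hypothetical paths:

```python
import json
import sqlite3
from pathlib import Path

import pandas as pd

def profile_frame(name, df):
    """Descriptive metadata for one dataset."""
    return {
        "name": name,
        "rows": int(df.shape[0]),
        "columns": {col: str(dtype) for col, dtype in df.dtypes.items()},
        "numeric_summary": df.select_dtypes("number").describe().to_dict(),
    }

def build_catalog(csv_dir, sqlite_path):
    """Profile every CSV under csv_dir and every table in the SQLite file."""
    entries = []
    for csv_file in Path(csv_dir).rglob("*.csv"):
        entries.append(profile_frame(str(csv_file), pd.read_csv(csv_file)))
    con = sqlite3.connect(sqlite_path)
    tables = [r[0] for r in con.execute(
        "SELECT name FROM sqlite_master WHERE type='table'")]
    for t in tables:
        entries.append(profile_frame(t, pd.read_sql_query(f"SELECT * FROM {t}", con)))
    con.close()
    return entries

# catalog = build_catalog("~/data", "~/data/db.sqlite")  # hypothetical paths
# Path("catalog.json").write_text(json.dumps(catalog, indent=2, default=str))
```

Variable descriptions that users should maintain by hand can live in the same JSON file next to the auto-generated statistics.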
Hi, I am quite a newbie with Python, and I need to run some text-mining analysis on 100+ literary texts in German, which I have stored as individual txt files in a folder. They are named with the scheme author_title_date (for example "schnitzler_else_1924.txt").
I was thinking of using the python package nltk and/or spaCy, and maybe the Stanford NER, as I need to analyse sentiments in the different texts and to identify specific locations as well as the sentiments in relation to such locations.
I am stuck on a very preliminary step though: how do I import all the text files from the folder into a single corpus that retains the metadata in the filename? I could relatively easily do that in R with tm, but I can't find a way to do it in Python. Thanks!
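A minimal way to do this in Python is pathlib plus a regular expression over the filename scheme. The folder name and the exact pattern below are assumptions to adapt; each document comes back as a dict whose "text" field can then be fed to spaCy or NLTK while the metadata travels alongside:

```python
import re
from pathlib import Path

# Hypothetical folder holding author_title_date.txt files.
FOLDER = Path("texts")

# author and title separated by underscores, then a 4-digit year.
PATTERN = re.compile(r"(?P<author>[^_]+)_(?P<title>.+)_(?P<year>\d{4})")

def load_corpus(folder):
    """Return a list of {author, title, year, text} dicts, one per .txt file."""
    corpus = []
    for path in sorted(Path(folder).glob("*.txt")):
        m = PATTERN.fullmatch(path.stem)
        if m is None:
            continue  # filename does not follow the scheme
        doc = m.groupdict()
        doc["year"] = int(doc["year"])
        doc["text"] = path.read_text(encoding="utf-8")
        corpus.append(doc)
    return corpus

# corpus = load_corpus(FOLDER)
# e.g. corpus[0] -> {"author": "schnitzler", "title": "else", "year": 1924, "text": ...}
```

Titles containing underscores are kept whole because the year anchor (`\d{4}`) is matched last.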
I am trying to extract some information (metadata) from GenBank using the R package "rentrez" and the example I found here https://ajrominger.github.io/2018/05/21/gettingDNA.html. Specifically, for my group of interest, I search for all records that have geographical coordinates and then want to extract data about the accession number, taxon, sequenced locus, country, lat_long, and collection date. As an output, I want a csv file with the data for each record in a separate row. I am attaching the script I have constructed and it seems it can do the job but at some point, rows get muddled with data from different records overlapping the neighbouring rows. For example, from the 157 records that rentrez retrieves from NCBI the first 109 records in the resulting file look like what I want to achieve but the rest is a total mess. I suspect this happens because the XML contents differ a bit between the GenBank entries but cannot figure out how to fix the problem. Any help would be greatly appreciated because I am a newbie with R and figuring out each step takes a lot of time. Thanks in advance!
I am working with Landsat 5 TM images. Unlike previous USGS Landsat products, which contained metadata in TXT format, the new products use XML format and are atmospherically corrected scenes. So I can't use ENVI 5 to do the radiometric corrections, as ENVI only supports the TXT metadata format.
As I try to use the NDWI = (Band2 - Band4)/(Band2 + Band4) calculation, I get a range of -6.2 to 6.1 instead of -1 to +1. This might be due to high spectral reflectance of the atmospherically corrected image.
How can I resolve this issue?
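One thing worth checking first: USGS Collection 2 Level-2 products (including Landsat 5) store surface reflectance as scaled integers, and the published rescaling is reflectance = DN * 0.0000275 - 0.2. If the band ratio is computed on the raw DNs, values far outside [-1, 1] are expected. A sketch with hypothetical DNs:

```python
import numpy as np

# USGS-published scale factors for Landsat Collection 2 Level-2 surface reflectance.
SCALE, OFFSET = 0.0000275, -0.2

def to_reflectance(dn):
    """Rescale Collection 2 Level-2 integer DNs to surface reflectance."""
    return np.asarray(dn, dtype=float) * SCALE + OFFSET

def ndwi(green_dn, nir_dn):
    """NDWI = (Green - NIR) / (Green + NIR), computed on rescaled reflectance."""
    g, n = to_reflectance(green_dn), to_reflectance(nir_dn)
    return (g - n) / (g + n)

# Hypothetical DNs; after rescaling, the index stays within [-1, 1].
print(ndwi(20000, 12000))
```

Verify the exact scale and offset in the scene's own XML/MTL metadata before applying them, since they are defined per product family.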
I am looking for wave-direction data on the southern coast of the Caspian Sea, to draw wave roses at Anzali port, Amirabad port, Babolsar, Nowshahr port, etc. I have already got the metadata from the Iranian National Institute for Oceanography and Atmospheric Science, but it was not helpful. Is there any other online provider?
(time history data at one point at an area around each port is adequate)
I have used Fitz so far to extract things like font, text size, etc. in a PDF file. What are some other good tools/libraries that can extract meaningful metadata about the contents of a PDF?
Some libraries store only the metadata of the complete PDF rather than metadata about its content (I'm not looking for those).
Any techniques would also be helpful. Is there a library or technique to access this information? Also, can we target metadata in a specific part of the PDF document?
Nowadays, the Spanish Cybermetrics Lab ranks about 25 thousand universities and about 2 thousand OA-Repositories.
The third partial indicator “openness” of Webometric University Ranking depends on the filling of university OA-Repositories since Google Scholar finds “rich files” in these OA-Repositories.
This, in turn, is attributable to Google Scholar and the OA-Repositories operating as part of a single Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH). The above-mentioned case history is illustrated by the link http://www.bsu.edu.ru/en/news/detail.php?ID=318290 on the Webometrics ranking of Belgorod State University (Russia).
I'm proposing that university management should pay close attention to the above-mentioned effect. In this context, how is the situation in other universities?
A Digital Object Identifier is a unique identifier assigned to scholarly articles to ensure their international standardization and easier identification of published articles even if the metadata URL changes. Should all publishers register and assign DOIs to their articles? Should DOI assignment be used as one of the key criteria by which authors choose a good journal?
Hi. I activated QIIME 2 2019 and prepared a folder named '6 sample file' to try data import; it contains the metadata, the manifest, and 6 data sequences, as in the picture. I received the error below, maybe because the manifest and metadata are not .qza files (I read the tutorial but could not find how to convert them into .qza). Thanks a lot.
(1/1?) no such option: -q
(qiime2-2019.7) a-PC:~ hebahussein$
(qiime2-2019.7) a-PC:~ hebahussein$ qiime tools import \
> --type 'SampleData[PairedEndSequencesWithQuality]' \
> --input-path 6 sample import \
> --output-path paired-end-demux.qza \
> --input-format PairedEndFastqManifestPhred64V2
Based on our successful experience in establishing the SWIG database on infiltration data, we decided to initiate another project to collect soil hydraulic properties and their metadata. Currently available databases that provide estimates of the soil water retention curve and the hydraulic conductivity curve use pedotransfer functions that do not include information about the structural properties. We are therefore seeking original soil hydraulic data (soil water retention curve, hydraulic conductivity curve, and air permeability) obtained on undisturbed soil samples and their corresponding soil properties (e.g. texture, carbon content, bulk density, taxonomic and land-use information) as well as quantitative and/or qualitative information on the soil structure. These structural data may include:
. Taxonomic information (blocky, granular, etc.)
. Ksat on undisturbed soils
. Pore-size data (obtained e.g. from nuclear magnetic resonance (NMR) relaxometry)
. Aggregate stability
. Geometric mean or mean weight diameter (GMD or MWD) of aggregates
. Fractal dimension
. Mechanical resistance, etc.
Any metadata describing the experimental data in more detail are welcome. Data may stem from lab-, plot-, field-, and catchment-scale experiments. We are planning to establish a database of these data that we will submit to a data journal (e.g. Earth System Science Data or any other suitable journal). In order to honor your support, we would like to include you as a co-author on the manuscript. We hope that you will contribute to this initiative, and we welcome any questions you might have.
Please use the enclosed template file (excel file) to share data with us.
We would also appreciate it if you could kindly distribute this request to colleagues who might be interested in contributing to this effort.
Our published article on the SWIG database is provided at the following link for those whose support we were unable to obtain for that project.
Mehdi Rahmati (firstname.lastname@example.org)
Rainer Horn (email@example.com)
Harry Vereecken (firstname.lastname@example.org)
Lutz Weihermüller (email@example.com)
Hi everybody, I have problems opening a .xdce file. The file is supposed to contain several series of images taken with an InCell 2200 microscope. I tried to open the file using either ImageJ with the Bio-Formats package or Fiji, but I was not able to visualize the images. I mean, when I try to open the file it asks me which series I want to open, it opens the file, and I can see the metadata, but the images are black. Trying to modify the brightness and contrast doesn't work; it's as if there is no fluorescence. This warning appears every time I try to open the file: "[WARN] Unknown ExperimentType value 'Fluorescence' will be stored as 'Other'". I don't understand whether the file is damaged or whether I am not opening the image properly. I fear an error occurred while transferring the folder containing the images from the microscope computer to my hard disk, because the folder contains only one file (it is supposed to contain several series of images) and also because my computer doesn't seem to recognize the .xdce file (the file has no icon, as when the computer doesn't know which app to use to open it). Reading the ImageJ guide, I understood this file extension should be supported by ImageJ and Fiji. Thank you in advance!
I have several years of data from various projects that I'd like to build into a database, but I don't know how. Microsoft Excel is useful for tabulating the individual datasets but not for grouping them all together. My data consist of benthic species lists and abundances, with metadata such as client, project number, project name, station names, station coordinates, and more.
Thanks in advance,
I am following the QIIME metadata-importing tutorial. I downloaded the barcode and sequence files using the URLs and tried to open them in QIIME, but they do not work, as follows:
(micro) a-PC:desktop hebahussein$ cd emp-single-end-sequences
(micro) a-PC:emp-single-end-sequences hebahussein$ qiime tools import \
> --type EMPSingleEndSequences \
> --input-path emp-single-end-sequences \
> --output-path emp-single-end-sequences.qza
-bash: qiime: command not found
(micro) a-PC:emp-single-end-sequences hebahussein$
Five major differences between data lakes and data warehouses:
1. Data Lakes Retain All Data
During the development of a data warehouse, a considerable amount of time is spent analyzing data sources, understanding business processes and profiling data. The result is a highly structured data model designed for reporting. A large part of this process includes making decisions about what data to include and to not include in the warehouse. Generally, if data isn’t used to answer specific questions or in a defined report, it may be excluded from the warehouse. This is usually done to simplify the data model and also to conserve space on expensive disk storage that is used to make the data warehouse performant.
In contrast, the data lake retains ALL data: not just the data that is in use today, but also data that may be used, and even data that may never be used, simply because it MIGHT be used someday. Data is also kept for all time so that we can go back to any point in time to do analysis.
This approach becomes possible because the hardware for a data lake usually differs greatly from that used for a data warehouse. Commodity, off-the-shelf servers combined with cheap storage makes scaling a data lake to terabytes and petabytes fairly economical.
2. Data Lakes Support All Data Types
Data warehouses generally consist of data extracted from transactional systems and consist of quantitative metrics and the attributes that describe them. Non-traditional data sources such as web server logs, sensor data, social network activity, text and images are largely ignored. New uses for these data types continue to be found but consuming and storing them can be expensive and difficult.
The data lake approach embraces these non-traditional data types. In the data lake, we keep all data regardless of source and structure. We keep it in its raw form and we only transform it when we’re ready to use it. This approach is known as “Schema on Read” vs. the “Schema on Write” approach used in the data warehouse.
3. Data Lakes Support All Users
In most organizations, 80% or more of users are “operational”. They want to get their reports, see their key performance metrics or slice the same set of data in a spreadsheet every day. The data warehouse is usually ideal for these users because it is well structured, easy to use and understand and it is purpose-built to answer their questions.
The next 10% or so do more analysis on the data. They use the data warehouse as a source but often go back to source systems to get data that is not included in the warehouse, and sometimes bring in data from outside the organization. Their favorite tool is the spreadsheet, and they create new reports that are often distributed throughout the organization. The data warehouse is their go-to source for data, but they often go beyond its bounds.
Finally, the last few percent of users do deep analysis. They may create totally new data sources based on research. They mash up many different types of data and come up with entirely new questions to be answered. These users may use the data warehouse but often ignore it as they are usually charged with going beyond its capabilities. These users include the Data Scientists and they may use advanced analytic tools and capabilities like statistical analysis and predictive modeling.
The data lake approach supports all of these users equally well. The data scientists can go to the lake and work with the very large and varied data sets they need while other users make use of more structured views of the data provided for their use.
4. Data Lakes Adapt Easily to Changes
One of the chief complaints about data warehouses is how long it takes to change them. Considerable time is spent up front during development getting the warehouse’s structure right. A good warehouse design can adapt to change but because of the complexity of the data loading process and the work done to make analysis and reporting easy, these changes will necessarily consume some developer resources and take some time.
Many business questions can’t wait for the data warehouse team to adapt their system to answer them. The ever increasing need for faster answers is what has given rise to the concept of self-service business intelligence.
In the data lake on the other hand, since all data is stored in its raw form and is always accessible to someone who needs to use it, users are empowered to go beyond the structure of the warehouse to explore data in novel ways and answer their questions at their pace.
If the result of an exploration is shown to be useful and there is a desire to repeat it, then a more formal schema can be applied to it and automation and reusability can be developed to help extend the results to a broader audience. If it is determined that the result is not useful, it can be discarded and no changes to the data structures have been made and no development resources have been consumed.
5. Data Lakes Provide Faster Insights
This last difference is really the result of the other four. Because data lakes contain all data and all data types, and because they enable users to access data before it has been transformed, cleansed, and structured, they let users get to their results faster than the traditional data warehouse approach.
However, this early access to the data comes at a price. The work typically done by the data warehouse development team may not be done for some or all of the data sources required for an analysis. This leaves users in the driver's seat to explore and use the data as they see fit, but the first tier of business users described above may not want to do that work; they still just want their reports and KPIs.
I have raw images without any associated metadata. I believe exposure time is essential for performing HDR composition. Is there any way or tool to estimate the exposure time, or is there an algorithm that does the composition without this data?
I ran into a problem in QIIME 2 when I was demultiplexing my data. The error is below:
There was an issue with loading the file Barcodemetadate.txt as metadata:
Metadata file path doesn’t exist, or the path points to something other than a file. Please check that the path exists, has read permissions, and points to a regular file (not a directory): Barcodemetadate.txt
There may be more errors present in the metadata file. To get a full report, sample/feature metadata files can be validated with Keemei: https://keemei.qiime2.org
Find details on QIIME 2 metadata requirements here: https://docs.qiime2.org/2019.1/tutorials/metadata/
I have already checked my metadata file with Keemei. It sent me 'Good Job!'; I did not have any errors in Keemei!
My data have barcodes, so they must be demultiplexed and then denoised. I used the only option in the tutorial: qiime demux emp-paired.
My data are Casava 1.8, already imported to a .qza artifact. The next step is demultiplexing.
The metadata is ready and has already been checked by Keemei. It reported no errors.
The command I used is:
qiime demux emp-paired \
--m-barcodes-file MyMetadataFileName.txt \
--m-barcodes-column Columename \
--i-seqs Artifcatname.qza \
--o-per-sample-sequences demux.qza
Can someone send me a relevant command to demultiplex my data? Or does somebody have any idea?
In data science there are three basic trends, as far as I have studied (maybe I am not correct). These trends are metadata solutions, data-pattern-recognition solutions, and classifying data into classes.
My question is: what could be possible future directions in data science?
I am looking for an R package and Python library for metadata development. Any suggestions will be highly appreciated. Below is more information about what I want.
I am working on a data project that requires me to create and store metadata either directly inside the data frame or externally to it, but the important thing is that users must be able to update the metadata as more discoveries about the datasets are made. The metadata must be able to store all sorts of information, such as the structure of the data, properties of the datasets, size, creator, and creation time of each dataset, with the hope that the metadata can be explored and visualised separately in the future.
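On the Python side, one lightweight option that may fit is the `.attrs` dictionary that pandas attaches to every DataFrame: it is user-editable in place, though it is documented as experimental and is not preserved by every operation, so a sidecar JSON file is a safer external store. A sketch with made-up values:

```python
import datetime

import pandas as pd

df = pd.DataFrame({"species": ["A", "B"], "abundance": [10, 3]})

# pandas DataFrames carry an .attrs dict for arbitrary metadata.
df.attrs = {
    "creator": "project team",  # hypothetical values throughout
    "created": datetime.date.today().isoformat(),
    "description": "toy dataset",
    "columns": {"species": "taxon name", "abundance": "raw count"},
}

# Users can update the metadata as they learn more about the dataset:
df.attrs["columns"]["abundance"] = "raw count per station"
print(df.attrs["columns"]["abundance"])
```

For external storage and later visualisation, `json.dumps(df.attrs)` written next to each dataset keeps the metadata inspectable without loading the data itself.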
This project is looking for soil organic carbon (SOC) data and SOC changes in China.
We are seeking your support in collecting SOC data to predict and map SOC. What we need are SOC, bulk density, and soil texture values for the whole of China, or for any province or district in China.
Any more detailed metadata are also welcome.
Thanks in advance.
I am trying to read the metadata file (*MTL.TXT) of Landsat 8 in Python 2.7.12. I have written a few functions. I want to return only specific values, or perhaps store them in a global variable, so that I can access them outside the function. Inside the function I am able to access those specific values. My aim is to convert DN values to reflectance.
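One way to avoid globals is to parse the whole MTL file into a dictionary and return it. A minimal sketch (the key names follow the Landsat 8 MTL format; the sample values are invented for illustration) that then applies the standard TOA reflectance rescaling ρ' = M_ρ·Q_cal + A_ρ:

```python
def parse_mtl(text):
    """Parse Landsat MTL 'KEY = VALUE' lines into a dict of floats/strings."""
    meta = {}
    for line in text.splitlines():
        if "=" not in line:
            continue
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip().strip('"')
        try:
            meta[key] = float(value)
        except ValueError:
            meta[key] = value
    return meta


def dn_to_toa_reflectance(dn, meta, band=4):
    """Convert a DN to top-of-atmosphere reflectance (no sun-angle correction)."""
    mult = meta["REFLECTANCE_MULT_BAND_%d" % band]
    add = meta["REFLECTANCE_ADD_BAND_%d" % band]
    return mult * dn + add


# Tiny invented MTL excerpt for demonstration; real files hold many more keys.
SAMPLE_MTL = """
  GROUP = RADIOMETRIC_RESCALING
    REFLECTANCE_MULT_BAND_4 = 2.0000E-05
    REFLECTANCE_ADD_BAND_4 = -0.100000
  END_GROUP = RADIOMETRIC_RESCALING
"""

meta = parse_mtl(SAMPLE_MTL)
print(dn_to_toa_reflectance(10000, meta))  # ≈ 0.1
```

Per the USGS handbook, a full conversion also divides the result by sin(SUN_ELEVATION), which is another key available in the MTL file.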
Can we say that metadata relates to the data, while context data relates to objects?
Also, is there a relation between context data and context awareness?
I am trying to perform meta-analyses with the metafor package (R) and I would like to learn more about the package's capabilities. I have doubts about the model calculations and would like to verify my analyses against a worked real-data example.
If you are somewhat familiar with the use of ontologies, please consider answering this survey.
In the context of the recent reconfiguration of the RDA Vocabulary Semantic Services Interest Group, we are leading a task group focused on ontology metadata.
The goal of this task group is to review the current state of metadata description for semantic resources in general (ontologies, vocabularies, thesauri and terminologies), and to discuss recommendations and best practices in terms of metadata standards for ontologies.
To give insights to the group and drive our work, we have set up a survey on the topic and would really appreciate your participation.
The survey is anonymous and uses Google Forms. It should take about 10-15 minutes.
Please take a moment to send us your feedback.
Thank you for your help.
Clement Jonquet, Biswanath Dutta, Anne Toulet and Barbara Magana
I can request historical NOAA weather-station data before 1997 for about 260 U.S. stations. However, around 6000 land-based U.S. stations are available through MesoWest, and the metadata tells me that some of these stations started before 1997. MesoWest's archived data begin in 1997. Is there a way to access pre-1997 data for these MesoWest stations, possibly through a third-party data provider?
I have thousands of images that I want to build a database around (in Excel), beginning with the information available in their metadata (e.g. date, camera name), which is viewable in Windows when browsing files and their properties. I don't want to enter these individually. My second goal is to automatically create hyperlinks to each file's location on my machine, so that clicking a link in the database launches the file with the default program. Is this possible without purchasing specialized software, perhaps with some macros written in Excel?
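If you are open to a small Python script instead of Excel macros, here is a standard-library-only sketch (the folder path and extension filter are placeholders): it walks the image folder, collects the file timestamps that Windows shows under Properties, and emits a CSV whose last column is an Excel =HYPERLINK() formula that opens the file when clicked. Camera model and capture date live in EXIF, which would need a third-party library such as Pillow; this sketch sticks to filesystem metadata.

```python
import csv
import datetime
import os


def build_index(folder, out_csv, exts=(".jpg", ".jpeg", ".png")):
    """Write a CSV of filename, size and modified time, plus an Excel hyperlink."""
    with open(out_csv, "w", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(["file", "size_bytes", "modified", "link"])
        for root, _dirs, files in os.walk(folder):
            for name in sorted(files):
                if not name.lower().endswith(exts):
                    continue
                path = os.path.join(root, name)
                stat = os.stat(path)
                mtime = datetime.datetime.fromtimestamp(stat.st_mtime)
                # Excel evaluates this formula and opens the file on click.
                link = '=HYPERLINK("%s", "open")' % path
                writer.writerow([name, stat.st_size, mtime.isoformat(), link])
```

Usage would be something like `build_index(r"C:\photos", "index.csv")` (hypothetical path), then open the CSV in Excel and save it as a workbook.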
I would like to add some descriptive information to an SPSS file I want to make public. Is there any option to include details about the observations, acronyms used, etc.?
Scholars, I am looking for a suitable metadata schema for digital heritage objects: text, images, and audiovisual material.
Your help will be greatly appreciated.
I am working on the topic "Utility Enhancement for Textual Document Redaction and Sanitization". I have noted in the literature on de-identification of medical documents that privacy models perform unnecessary sanitization by sanitizing negated assertions (e.g. "AIDS negative"). I want to exclude negated assertions before sanitizing a medical document, which will improve the utility of the document, and I want to know which dataset would be appropriate for my work. I tried to use the 2010 i2b2 dataset, but I could not find its metadata. The 2014 i2b2 de-identification Challenge Task 1 consists of 1304 medical records for 296 patients, of which 790 records (178 patients) are used for training and the remaining 514 records (118 patients) for testing. The medical records are a fully annotated gold-standard set of clinical narratives. The PHI categories are grouped into seven main categories with 25 associated sub-categories. The distributions of PHI categories in the training and test corpora are known; for example, the test corpus contains 764 age mentions, 4980 dates, 875 hospitals, etc. I want to know the same information for the 2010 i2b2 dataset, which I have not yet been able to find.
I am doing my dissertation on metadata for video, but I don't know how to gather much information about video statistics. Any thoughts or suggestions? I would like to see which parts of my video people watch most, in order to capture viewing patterns.
Sentinel-2 imagery granules downloaded from the AWS archive (http://sentinel-pds.s3-website.eu-central-1.amazonaws.com) have a slightly different data structure than the original granule sub-folders of an ESA SAFE product. Consequently, the SNAP tool seems unable to import the data as one product with metadata, and I need to process the data to Level-2 (atmospherically corrected ground reflectance). Is there a way? Or is there a site that allows downloading individual granules in the original SAFE data structure?
(question originally asked on Stack Exchange)
We are trying to implement a web-based catalog for our photo library. One important requirement is compatibility between the software and IPTC metadata.
The issue is that my target area from different days does not overlap across images. When I use UTM coordinates for referencing instead of lat-lon values, the results are even worse.
Below I describe my methodology, which might help in debugging the issue.
I am trying to use Landsat 8 data to track temporal changes on the land surface (for a small area of about 15x15 km, about 512x512 pixels of a Landsat 8 image). The satellite passes over the same area on Earth (let's call it the target) approximately every 16 days. Since the area covered by the satellite shifts slightly between passes, images from different dates need to be referenced before they can be overlaid.
To locate the target on different temporal layers of Landsat, I use the width and height of the images and the UTM coordinates given in the metadata file. Example metadata file: https://s3-us-west-2.amazonaws.com/landsat-pds/L8/043/024/LC80430242013259LGN00/LC80430242013259LGN00_MTL.txt
The X and Y UTM coordinates change linearly with the pixel indices of the image (my assumption). So I use formulas like these to obtain the pixel indices i, j where the boundary of my target starts:
i = (width  / (maxX - minX)) * (X - minX);
j = (height / (minY - maxY)) * (Y - maxY);
Here minX, minY, maxX and maxY are the boundaries of the image in UTM coordinates; width and height are the number of columns and rows of pixels in the image. X and Y are the absolute UTM coordinates of the upper-left boundary of the target. So i and j are the pixel coordinates where the upper-left boundary of the target starts. I also tried similar formulas with lat-lon coordinates.
I create an image of my target area (512x512 pixels) by taking the pixels in the range (i, i+512) and (j, j+512).
Now, the target area from different days is overlaid approximately well when I use lat-lon coordinates, but it is quite displaced when I use UTM coordinates. I expected the UTM coordinates to give accurate results, and it doesn't make sense why the overlays are not coming out accurate.
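The linear mapping described above can be sketched as follows (a Python illustration with invented corner coordinates; real values would come from the CORNER_UL/LR_PROJECTION_X/Y_PRODUCT keys in the MTL file). Landsat Level-1 products are gridded in UTM, so the pixel grid really is linear in UTM, and the Y axis decreases from top to bottom, which the row formula accounts for:

```python
def utm_to_pixel(x, y, min_x, max_x, min_y, max_y, width, height):
    """Map UTM easting/northing to (col, row) on a north-up, linear pixel grid."""
    col = width * (x - min_x) / (max_x - min_x)
    row = height * (y - max_y) / (min_y - max_y)  # row index grows as Y decreases
    return round(col), round(row)


# Invented scene corners: a 7600 x 7600-pixel scene at 30 m resolution, one UTM zone.
MIN_X, MAX_X = 500000.0, 728000.0
MIN_Y, MAX_Y = 4000000.0, 4228000.0

# Upper-left corner maps to pixel (0, 0); one 30 m step right/down maps to (1, 1).
print(utm_to_pixel(MIN_X, MAX_Y, MIN_X, MAX_X, MIN_Y, MAX_Y, 7600, 7600))            # (0, 0)
print(utm_to_pixel(MIN_X + 30, MAX_Y - 30, MIN_X, MAX_X, MIN_Y, MAX_Y, 7600, 7600))  # (1, 1)
```

One possible explanation for the observed displacement: UTM eastings and northings from different UTM zones are not comparable, so if the two acquisitions were processed in different zones (or their corner values read from different product grids), this linear formula breaks down for UTM while lat-lon, being global, still lines up approximately.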
Any research needs a library for categorizing and citing other research. Many tools are available, but I prefer the open-source Zotero, which integrates with Firefox, Word, and OpenOffice.
It's easy to add items to Zotero and categorize them using Firefox, then use them in documents in Word or OpenOffice.
Sharing is also important: using the Zotero server you can back up your library remotely, sync between multiple devices, and share part of it within group libraries.
Unfortunately, Zotero doesn't support social research sites like ResearchGate or Academia.edu, because no metadata is linked to the research items there.
Also, groups don't have discussion tools; they are only shared libraries.
What is your preferred tool?