Questions related to DNA Sequence Analysis
For the last few years I've been using the classics (BioEdit, MEGA, Arlequin, DnaSP, Haploviewer, jModelTest...) to manipulate sequence data and compute statistics or NJ trees from it. I wonder whether there is a good alternative (or a short selection) of R packages that do the same things.
I am stuck trying to find lncRNA tools to annotate files that contain DNA sequences. I have four GEO accessions: GSE55191, GSE89186, GSE58043, and GSE64631. I ran the GEO2R analysis on each of them separately and then combined the GEO2R results in a Venn diagram to find the overlapping IDs. Now I want to annotate the sequences with a tool that I can run on Windows.
Or is there any other way to use the DNA sequence for lncRNA research?
Hi everyone, I am new to molecular work and I want to know how to use the UNITE database for my DNA sequences. Is there a tutorial for complete beginners?
- Telomeres are distinctive structures found at the ends of our chromosomes. They consist of the same short DNA sequence repeated over and over again. They protect the ends of our chromosomes by forming a cap, much like the plastic tip on shoelaces. If the telomeres were not there, our chromosomes might end up sticking to other chromosomes. When the telomere becomes too short, the chromosome reaches a 'critical length' and can no longer be replicated. This 'critical length' triggers the cell to die by a process called apoptosis, also known as programmed cell death.
1) I need an explanation of the following statement from a protein expression review article
" The only drawback of these vectors is that they typically have only a single restriction site for N-terminal cloning (NcoI). Thus, if you have a NcoI site in the DNA sequence of your “to-be-expressed” protein, then you cannot use these vectors for cloning unless they are genetically modified to incorporate a new restriction site. Because proteins often have multiple NcoI sites, including proteins we are studying in our laboratories, we modified a subset of these vectors to include an NheI site immediately following the NcoI site, allowing these modified vectors to be used to subclone DNA sequences with NcoI sites"
My question is: why can't vectors with just one NcoI cloning site be used to subclone DNA sequences that contain NcoI sites? And what difference does the other site (NheI) make in fixing this issue?
2) When should I use a restriction enzyme that creates blunt ends versus sticky ends? I know the difference between these ends, but I do not understand why one would choose an enzyme that creates one type of end over the other.
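On the first question, the reasoning in the quoted passage can be checked directly on a sequence: if the insert contains the NcoI recognition sequence internally, an NcoI digest would also cut inside the insert. A minimal sketch in plain Python, assuming the standard recognition sequences CCATGG (NcoI) and GCTAGC (NheI); the example insert is hypothetical:

```python
# Recognition sequences; both are palindromic, so scanning one strand is enough.
SITES = {"NcoI": "CCATGG", "NheI": "GCTAGC"}


def find_sites(seq, sites=SITES):
    """Return {enzyme: [0-based start positions]} for each recognition site."""
    seq = seq.upper()
    hits = {}
    for enzyme, motif in sites.items():
        positions = []
        start = seq.find(motif)
        while start != -1:
            positions.append(start)
            # Step by one base so overlapping sites would also be reported.
            start = seq.find(motif, start + 1)
        hits[enzyme] = positions
    return hits


# Hypothetical insert with one internal NcoI site and no NheI site.
insert = "ATGGCTCCATGGTTAAGGCCTTGA"
print(find_sites(insert))
```

An insert that reports an internal NcoI hit but no NheI hit is the case the modified vectors in the quoted passage are meant to handle.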
I'd like to engineer a probiotic to secrete GHRP-6. Because GHRP-6 contains several D-form amino acids, would it be possible to transform bacteria with DNA sequences encoding the same six amino acids and keep the same function? That is, does GHRP-6 composed entirely of L-form amino acids retain its function? What is the purpose of the D-form amino acids in GHRP-6?
We would like to know the best value for money commercial company for DNA sequencing as part of an RNA-seq study.
Thank you. Joe Duffy
In the entire primer sequence there is a nucleotide that does not match its complementary base in the template DNA sequence.
Through Arlequin, I'm hoping to learn how to study population genetics (e.g., Fst values, Hardy-Weinberg analyses) using DNA sequences. Perhaps you could give screenshots of the files you have or a step-by-step guide on how to perform it. All I know is how to use Arlequin to analyze microsatellite data.
Your response will be greatly appreciated!
Hi, I recently had two non-standard DNA sequencing outputs. I'd be curious if anybody could advise on what could have gone wrong, so that I can avoid such mistakes in the future.
First, pBA002a_rev. It looks like mixed DNA? But only up to something like 294 nt, then it looks OK. But when I BLASTed that sequence, it turned out to be E. coli chromosomal DNA. So I suppose my plasmid was contaminated by E. coli chromosomal DNA and it happened to contain a sequence similar enough to my primer? But why would the mixture stop after 290 nt?
Second are gAtNSH1_S_7_1 and gAtNSH1_T_6_1. These are from cloning the same gene, but with different primers (with/without STOP codon) and from independent colonies. But if you check around position 782/783, they look very similar. Before that position, the intensity is around 1500, while very shortly after that position, the intensity is only around 500. Moreover, this specific position, the first G in TGGT, is actually a T-to-G mutation. But it is already in the plasmid, so I wouldn't expect a mutation there. Plus it's the same in two independent clones. Any idea what could have happened here?
I introduced a single-base mutation, which changes one amino acid, by PCR and then transformed the plasmid into DH5-alpha. At this step, I selected the desired transformants by spreading on selective medium (LB+Amp) and picked some colonies for DNA sequencing. After I obtained the plasmid with the correct point mutation, I transformed it into yeast using an auxotrophic marker (SD without Ura). I got lots of yeast transformants, which brings me to my question: how can I make sure that the desired plasmid has been transformed into the yeast cells? My other question is about the molecular biology: my yeast transformant contains both its original gene and the mutated gene from the plasmid, so how can I determine whether the mutated gene affects cell metabolism when the original gene is still present in the cell?
I would like to know how to get all possible DNA sequences that encode every 20-amino-acid frame (20 aa means 60 nt) in protein X, and all possible aa substitutions at each residue of this protein.
I want every 20-amino-acid stretch of protein X to be linked to a 10-nt barcode (a huge library, 20 to the power of 20) and then test the expression of this protein using high-throughput sequencing. I would first like to get a few sequences of this system. If this works, I will order libraries that combine the 10-nt barcodes with the 60-nt fragments of protein X and then test expression of the whole library using deep sequencing.
The sequence I want looks like this before cloning it to the vector:
NNNNN-[Barcode]-XXX-[protein X, 60 nt (20 aa)]-NNNNN
XXX are 3 restriction endonucleases.
Please help me out with any available tools or websites to do this.
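A small script may be enough to generate a first batch of sequences before looking for a dedicated tool. The sketch below is only an illustration under stated assumptions: it slides a 20-residue window along the protein, lists every single-residue substitution of a window, and back-translates with one arbitrarily chosen codon per amino acid (so it does not enumerate all synonymous DNA sequences, and the codons are not optimized for any host). The protein string and all function names are hypothetical.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# One valid codon per amino acid, chosen arbitrarily (an assumption, not a
# codon-usage recommendation).
CODON = {
    "A": "GCT", "C": "TGT", "D": "GAT", "E": "GAA", "F": "TTT",
    "G": "GGT", "H": "CAT", "I": "ATT", "K": "AAA", "L": "CTG",
    "M": "ATG", "N": "AAT", "P": "CCG", "Q": "CAG", "R": "CGT",
    "S": "TCT", "T": "ACC", "V": "GTG", "W": "TGG", "Y": "TAT",
}


def windows(protein, size=20):
    """Yield (start, fragment) for every overlapping window of `size` residues."""
    for start in range(len(protein) - size + 1):
        yield start, protein[start:start + size]


def single_substitutions(fragment):
    """Yield every variant of `fragment` that differs at exactly one residue."""
    for i, original in enumerate(fragment):
        for aa in AMINO_ACIDS:
            if aa != original:
                yield fragment[:i] + aa + fragment[i + 1:]


def back_translate(fragment):
    """Return one DNA sequence encoding `fragment` (3 nt per residue)."""
    return "".join(CODON[aa] for aa in fragment)


# Hypothetical 25-residue protein, used only to show the calls.
protein = "MKTAYIAKQRQISFVKSHFSRQLEE"
for start, frag in windows(protein):
    variants = list(single_substitutions(frag))
    print(start, back_translate(frag), len(variants))
```

Barcodes and restriction sites could then be concatenated around each 60-nt fragment as plain strings, following the layout shown above.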
With Thermo-Fisher no longer providing support in the form of service/maintenance contracts for their range of ABI 3130/3130XL DNA Sequencers, does anyone here know of another company based in Europe who can provide such services for these machines? I will also consider companies providing ad-hoc repair services.
Thank you for your time,
I would like to explore and validate the role of a drug in breast cancer treatment, particularly in the absence of mismatch repair proteins, via in-silico analysis. Picking random DNA would not serve the purpose. Can anyone suggest how to select the DNA sequence / structure? The DNA should be no more than 12 to 14 nucleotides long and should be specific to that particular protein.
I am trying to dock the drug with DNA (with or without MMR proteins).
Hello! I am trying to do a new report on happiness and how it is embedded in our DNA. I was wondering if anyone knew the DNA sequence for happiness? I am trying to compare what happiness looks like in a person who has not been diagnosed with depression and what it looks like in a person who does have depression.
I am in the process of sequencing an antibody using DNA sequencing in addition to de novo protein sequencing using mass spectrometry. When comparing the results I seem to have differences between the two results. Is this common, and are there any explanations for this result?
I know that several genes come one after the other under a single promoter in an operon, but what is exactly between those genes? Does the start codon of the second ORF come right after the ORF of the first gene? If there is a specific example with the dna sequence, that would be great.
Also, does an operon always require an operator?
I am interested in which bioinformatics skills are most requested by companies and academia, so that I can find a job most easily. DNA sequence analysis, or something else? Can someone suggest which skills are requested by most companies and give a good chance of getting a job? And is the situation the same in industry and academia, or are different things needed in academia?
What do you suggest?
After DNA sequencing, I have to perform proofreading and quality trimming of forward and reverse DNA sequences that will be assembled to come up with the full length of the amplified DNA fragment to ensure only high-quality data is used for downstream analyses such as BLAST analysis and phylogenetic analysis.
So my question is: how do I assemble high-quality reads of the forward and reverse DNA sequences using CodonCode Aligner or any other software?
I have several PCR products about 400-500 bp in length. How should I fragment these PCR products to get 250 bp DNA? I don't have any restriction enzymes, only a Covaris sonicator. Is it suitable for fragmenting such short DNA sequences? I have used it before to fragment genomic DNA, but I have some doubts about its ability to fragment such short PCR products.
I have a problem and I don't know how to explain it. I performed RFLP PCR on patient X to test for a specific pathogenic mutation and detected no mutation. Since the patient looked sick, DNA sequencing was done, and the sequencing showed the mutation. I would like to add that the patient was tested repeatedly with both the first and the second method. At the same time, controls were run in which the tested mutation did show up in RFLP PCR. This is the first time I have encountered such a situation. How can it be explained? I would like to add that it is a heterozygous mutation.
I have treated some of my plasmid samples with RNase A to get rid of their RNA contamination.
I'm going to send the samples for sequencing afterward.
Is it necessary to clean up my samples after RNase treatment?
Does RNase A cause any problems in the DNA sequencing process?
Does the orientation of a relatively short (e.g., between about 40-100 bp) bait DNA sequence in a Y1H assay matter that much? Or is it very much dependent on the sequence and the transcription factor?
The transcription factor I am using is fused to the GAL4 AD at the C-terminus, so I am wondering whether, if the orientation is important, the GAL4 AD may be pointed away from the reporter gene start site, thus reducing expression.
Hello dear researchers
Would you please recommend a widely used and suitable database for protein, carbohydrate, RNA, and DNA sequences?
The purpose of my model is to predict protein with peptides, DNA, RNA, and carbohydrates
I am waiting for your guidance
We plan to carry out a RNAseq project with BGI but before we proceed I wanted to know if anyone has used this service and if there were any issues one had with it.
As science advances, more and more high-throughput technologies are becoming available. No single lab can claim to be expert in all technologies. To stay up to date and learn better techniques, a student has to attend workshops and hands-on training in labs proficient in such technologies. Array CGH, next-generation sequencing, proteomics and metabolomics are now coming up. Bright and sincere students can opt for these advanced technologies for their PhD work. Sometimes, private laboratories get the research work done more easily. However, for the benefit of the student, and also for his/her future career, the person should learn a few of the best technologies available in the country or around the world. Any suggestions from your experiences?
I have been working on recombinant p53R248Q and p53WT. It is necessary to show that the proteins are functional. For the wild type, I can show that the protein specifically binds to cognate DNA sequence. For p53R248 this doesn't hold true because the mutation abrogates DNA binding activity. So how do I show that p53R248 protein that we expressed and purified is what it is from a functional point of view. Is showing loss of function compared to the wild type protein enough?
Is there any way to obtain the full, complete 100 million nucleotides of the human DNA sequence?
I want to know where I can find a DNA library that holds human and other DNA sequences.
I have a list of almost 3000 DNA sequences where each sequence is separated by '>'. Using BioPython I can convert each DNA sequence to its corresponding mRNA transcript by entering the input manually. But as I have a large number of sequences to work on, this is a very lengthy procedure. Is there any way to automate the input so that the script takes the DNA sequences from the file itself?
Thank you in advance.
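One way to automate this is to loop over the records in the file instead of pasting each sequence in by hand. The sketch below uses only the Python standard library and assumes the file is in FASTA format (a '>' header line followed by sequence lines) and that each sequence is the coding strand, so transcription is just T to U; the function names and file names are illustrative. In Biopython, `SeqIO.parse(handle, "fasta")` together with `Seq.transcribe()` should achieve the same thing.

```python
def read_fasta(path):
    """Yield (header, sequence) pairs from a FASTA file."""
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # A new record starts: emit the previous one, if any.
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            else:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def transcribe(dna):
    """Coding-strand DNA to mRNA: replace every T with U."""
    return dna.upper().replace("T", "U")


def transcribe_file(in_path, out_path):
    """Write the transcript of every record in `in_path` to `out_path`."""
    count = 0
    with open(out_path, "w") as out:
        for header, seq in read_fasta(in_path):
            out.write(f">{header}\n{transcribe(seq)}\n")
            count += 1
    return count


# Example call (hypothetical file names):
# transcribe_file("sequences.fasta", "transcripts.fasta")
```

With roughly 3000 records, a single call processes the whole file in one pass.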
For identification of bacteria, the 16S DNA sequence is important. I would like to know whether a partial sequence suffices or whether the complete sequence provides better clarity. In addition, sometimes a partial sequence matches only part of a longer bacterial 16S DNA sequence; in that scenario, how can we determine whether the bacterium is novel or not?
I want to take my analysis of DNA sequences further and cover all the properties that nucleotides may have. What about charge? Do the bases carry charge by default, or does a current have to pass through the DNA for it to become charged? By which rules is the charge distributed among the nucleotides, and is the charge per base or for the whole DNA sequence?
I have found several eukaryotic promoters/genes. However, if I want to express a sgRNA in E. coli, it should be possible. Am I wrong?
I want to encode network packets (commands and attributes) as a DNA sequence, but I want to map those network commands onto a meaningful, feature-bearing DNA sequence. By that I mean that in the encoded network command each codon has a logical relationship with the next one, and each codon must correspond to an existing amino acid.
Can we build a DNA sequence that contains features rather than just letters (A, G, C, T) placed randomly beside each other? Is there a precedent for this in biology? What are your thoughts on this?
After SDM (site-directed mutagenesis) on a protein-coding sequence from Myxococcus xanthus, which was then cloned in E. coli DH5 alpha, the sequencing results showed that a long region (1807 bp) had disappeared. However, compared with the WT plasmid on the electrophoresis gel, the length of the DNA looked similar (running slightly higher than the WT).
I am considering the repair mechanism of E. coli as a cause. Have you encountered the same problem on your bench? Could you point me to papers related to this kind of problem? Or how could I check it on the bench?
Some information and clues:
The GC content of this sequence is high because it comes from M. xanthus.
PCR thermocycling conditions (annealing at 60℃) and the use of a Hot Start polymerase may have led to mispriming.
Could contamination cause this to happen?
Provided that we have a double helical DNA sequence (5-100 no) with internal Fluorophore and quencher tag in the sequence, is it possible to successfully clone the sequence in a vector?
I want to collect and preserve insects during a 2 weeks field trip, for later extractions of high molecular weight DNA for long-reads DNA sequencing (PacBio HiFi & ONT). What would be the best way to preserve the samples? Thanks a lot. Charles
I am working on a project to classify DNA sequences using Deep Neural Networks (DNN). However, the DNA sequences are of unequal length, and I want the input sequences to be of the same length before converting them into numerical form to prevent an imbalance of training. What is the best way to convert input DNA sequences of variable lengths to equal lengths to increase the accuracy of prediction?
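A common baseline is to fix a target length, truncate longer sequences, pad shorter ones with a neutral symbol, and one-hot encode so that the padding maps to an all-zero vector. The sketch below is plain Python under those assumptions (right-padding with 'N', target length defaulting to the longest sequence); whether it gives the best accuracy depends on the data, and the function names are illustrative.

```python
# One-hot vectors for the four bases; anything else (N, padding) becomes zeros.
ONE_HOT = {
    "A": [1, 0, 0, 0],
    "C": [0, 1, 0, 0],
    "G": [0, 0, 1, 0],
    "T": [0, 0, 0, 1],
}


def pad_or_truncate(seq, length, pad="N"):
    """Cut `seq` to `length` or right-pad it with `pad` up to `length`."""
    return seq.upper()[:length].ljust(length, pad)


def one_hot(seq):
    """Encode a DNA string as a list of 4-element vectors."""
    return [ONE_HOT.get(base, [0, 0, 0, 0]) for base in seq]


def encode_batch(seqs, length=None):
    """Encode sequences to a common length (default: the longest one)."""
    if length is None:
        length = max(len(s) for s in seqs)
    return [one_hot(pad_or_truncate(s, length)) for s in seqs]


batch = encode_batch(["ACGT", "AC", "ggttaacc"], length=6)
print([len(x) for x in batch])
```

Choosing the length from a percentile of the length distribution rather than the maximum is another option when a few sequences are much longer than the rest.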
I am planning to finish my PhD at the end of this year and need to express proteins from DNA sequences. 3 out of 4 DNA sequences could be expressed without problems, but not the fourth. The insert is about 1,500 bp in size and is in the pET26b(+) vector. In the meantime, I have changed the peptide sequence (or the DNA sequence) three times and have also tried to carry out the expression at 30°C instead of 37°C. I use BL21 bacteria for expression. Can anyone tell me another bacterial strain and / or expression vector that I could try out? I would be very grateful for further suggestions for improvement and possible solutions.
Thank you very much and best regards,
I have the sequences of the mtCOI region and I want to convert them into a colour-coded barcode for better presentation. Does anyone have an idea how to achieve this?
Your help is appreciated
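If no dedicated tool is at hand, a colour barcode can be drawn directly: one thin vertical bar per base, with a fixed colour per nucleotide. The sketch below writes a standalone SVG using only the Python standard library; the colour scheme, the bar size, and the function name are arbitrary assumptions, not a published barcode standard, and the example sequence is made up.

```python
# Arbitrary colour per base; unknown or ambiguous bases are drawn in grey.
COLOURS = {"A": "#2ca02c", "C": "#1f77b4", "G": "#000000", "T": "#d62728"}


def barcode_svg(seq, bar_width=2, height=60):
    """Return an SVG string with one coloured bar per nucleotide."""
    seq = seq.upper()
    bars = []
    for i, base in enumerate(seq):
        colour = COLOURS.get(base, "#cccccc")
        bars.append(
            f'<rect x="{i * bar_width}" y="0" width="{bar_width}" '
            f'height="{height}" fill="{colour}"/>'
        )
    width = len(seq) * bar_width
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" '
        f'height="{height}">' + "".join(bars) + "</svg>"
    )


# Made-up fragment, only to show the call.
svg = barcode_svg("ACTTTATATTTTATTTTTGG")
with open("barcode.svg", "w") as out:
    out.write(svg)
```

The resulting file opens in a web browser or a vector graphics editor and can be scaled without loss for figures.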
What could cause a smeared bacterial gDNA band? Could not changing tips between loading a DNA sample and then loading the negative control lead to a faint band of the same size? (I am just trying to find the cause of the faint band, since no reagent was contaminated.) What can I do to improve my gel?
1% gel, 95 V for 60 minutes, 1 kb ladder
I have Brassica napus transgenic plant sequences and wild-type sequences. I want to see whether they are homozygous or not by looking at the chromatograms.
I am trying to sequence a plasmid construct using the primers we used to obtain the insert by PCR. The primer has a Tm of 58. After trying an annealing temperature of 50, I obtained the attached image. It seems to work correctly, but suddenly the detector saturates for all 4 dyes. There seem to be good peaks just before and after this, but the reaction stops too soon. Any idea what is happening?
I have reinjected the reactions but I get exactly the same result.
Thanks everyone in advance
I'm using a multistep approach to insert multiple genes in sequence into the pHIV lentiviral vector. I am PCR amplifying the genes from either gblocks from IDT or other plasmids. I use PCR to amplify the genes of interest, confirm their size by agarose gel, and then attempt to assemble them using the NEB HiFi kit (specifically the 2x NEB HiFi master mix). When I submit the assembled plasmid for sequencing I am able to confirm insertion of the gene of interest for that round of cloning. However, the problem I am experiencing is that after several rounds of inserting each gene, I have gone back and attempted to sequence the entire assembled cassette and am noticing there are random sections of gene sequences missing.
Has anyone else seen this? Does anyone have any suggestions for working around it? Is there a better kit available?
My question is: what are the possible ways to convert an AA sequence/nucleotide sequence into a 2D matrix so that it can be treated as an image and given as input to a CNN or any other image-based DNN? I have two possible solutions in mind, but both have limitations:
1. One can build a matrix from a sequence directly by splitting it into equal rows and columns. But if the sequences are of different lengths, the image dimensions will also vary between sequences. Is it a good idea here to trim the images to the mean size, or to pad them with zeros so that all images have the same dimensions?
2. DPC (dipeptide composition) can be computed for sequences, which yields a 20x20 matrix, but that size is quite small. Also, it captures frequency information, but the position-specific information of the AAs is lost.
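For reference, the second option can be sketched in a few lines. The code below assumes the usual definition of dipeptide composition (the count of each ordered pair of adjacent residues divided by the total number of pairs) and returns a 20x20 nested list; the residue ordering and the function name are arbitrary choices for this illustration.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def dipeptide_composition(protein):
    """Return a 20x20 matrix of adjacent-residue pair frequencies."""
    protein = protein.upper()
    matrix = [[0.0] * 20 for _ in range(20)]
    pairs = 0
    for first, second in zip(protein, protein[1:]):
        # Skip pairs containing non-standard residues (X, B, Z, ...).
        if first in INDEX and second in INDEX:
            matrix[INDEX[first]][INDEX[second]] += 1
            pairs += 1
    if pairs:
        matrix = [[count / pairs for count in row] for row in matrix]
    return matrix


m = dipeptide_composition("MKTAYIAKQRQISFVKSHFSRQLEE")
print(len(m), len(m[0]), round(sum(map(sum, m)), 6))
```

Because the matrix has the same shape for every sequence, it avoids the padding question from the first option, at the cost of the positional information mentioned above.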
I'm doing my thesis on promoter regions in some jasmonate genes in cassava (M. esculenta). I have some questions about DNA cloning...
Is it possible to lose some base pairs (bp) of the promoter's sequence? I mean, if I have my amplified product (promoter) of 1000 bp from a normal PCR, will I get the same 1000 bp sequence (promoter) after DNA cloning?
Under what circumstances are base pairs of a sequence lost?
Are some base pairs lost when the DNA is inserted into a plasmid?
Also, does the same happen in DNA sequencing?
Excuse my English (this is my first post) :)
I was looking at differences between two nucleotide sequences by comparing the substitution pattern at each site. A p-value has been obtained for each comparison, suggesting whether or not the two sequences differ significantly from each other. There is more than one comparison for each species, and I am more interested in the differences at the species level.
As far as I understand, the p-value is about the significance of a probability: it tells us whether or not the comparison differs significantly from the null hypothesis.
The question is: is it advisable to look at the average p-value across all sequence pair comparisons for each species? Will the average p-value reflect the heterogeneity of each species?
In my project, we are trying to mutate algae so that they artificially accumulate zeaxanthin.
I want a more meaningful project, so I hope to apply bioinformatics methods.
My rough idea is that if I obtain mutant algae that accumulate zeaxanthin, I will compare their sequence with that of other organisms that naturally contain a lot of zeaxanthin.
Eventually, I could find out which sequence determines zeaxanthin accumulation.
But this is my first time sequencing DNA, and I am not sure how to start.
Could you recommend some ideas or methods?
When I submit DNA sequences to NCBI, they ask for clarification, saying "Some or all of the protein coding sequences contain internal stop codons, reading frame shifts (insertions/deletions based on BLAST similarity search results and/or an alignment), and/or have translations that show little or no similarity to other proteins in the database".
How can I solve this problem?
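A first check before resubmitting is to translate each sequence and see where stop codons fall. The sketch below is a minimal illustration that assumes the standard genetic code and only looks at the three forward reading frames; a mitochondrial gene, for example, would need a different translation table, and the function names and test sequence are made up.

```python
from itertools import product

# Standard genetic code in the usual compact TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna, frame=0):
    """Translate `dna` from `frame` (0, 1 or 2); unknown codons become 'X'."""
    dna = dna.upper()
    protein = []
    for i in range(frame, len(dna) - 2, 3):
        protein.append(CODON_TABLE.get(dna[i:i + 3], "X"))
    return "".join(protein)


def internal_stops(dna):
    """Return {frame: [codon indices of stops before the last codon]}."""
    report = {}
    for frame in range(3):
        protein = translate(dna, frame)
        report[frame] = [i for i, aa in enumerate(protein[:-1]) if aa == "*"]
    return report


# Made-up sequence with a stop codon in the middle of frame 0.
print(internal_stops("ATGAAATAGGGGTAA"))
```

If the intended frame shows internal stops while another frame reads through cleanly, that points to a wrong reading frame in the annotation or to an insertion/deletion introduced during sequencing or editing, which can then be checked against the chromatograms.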
Forgive my ignorance...
I have the DNA sequence for the LCVR and HCVR of Adalimumab.
I need a resource to find the DNA sequences of the LC and HC constant regions, specifically IgG1 (HC) and kappa (LC).
I tried the NIH Nucleotide search but had no luck.
If someone could point me in the right direction that would be much appreciated.
Thank you for your time.
I am on the lookout for the Enhanced Yellow Fluorescent Protein (Aequorea victoria) DNA sequence. Does anyone know where I can find it?
Thank you in advance
We use an AB 3130xl Genetic Analyzer for Sanger sequencing, and on occasion we see our plasmid samples suffer from wavy peaks and loss of resolution (as seen in the attached image). Our general assumption is that a contaminant is co-migrating during electrophoresis, but I'd like to know if anyone has narrowed down more specifically what those contaminants might be?
I'd like to provide better troubleshooting suggestions to people on what is causing this issue. Any advice/suggestions would be greatly appreciated! Thank you.
They are mentioned in a lot of papers from the 70s to these days in many introductions (not the sequences but the viruses). However I am unable to find any sequences in NCBI.
I suppose they do not exist but I ask just in case they are in another DB or under a different name and I missed them.
Trout and other salmonid taxonomies are still in a chaotic state and in many respects have advanced little since the 19th Century. Salmonids are renowned for their phenotypic plasticity expressed under different environmental conditions. This high plasticity in many morphological characters and life histories is such that almost any population will be found to differ from other populations especially if only a few populations are compared. Yet such characters are the basis of many species descriptions. Some claim to be following the Evolutionary Species Concept (ESC) of Simpson (1951), where “An evolutionary species is a lineage evolving separately from others and with its own unitary role and tendencies”. Evidence for the ESC is provided by morphological differences that are adaptive in nature (my emphasis), i.e., by definition have a genetic basis (Simpson, 1961). Yet many simply assume that the morphological differences that they use have a genetic and adaptive basis without further investigation even though heritability may be extremely low or absent. In that respect their approach is purely phenetic.
Since most conservation legislation is species-based, accurate taxonomy is key to the conservation of salmonid biodiversity. Bad taxonomy can kill by failing to recognise a population as a distinct taxon, so that it does not receive the conservation attention it requires. On the other hand, it can result in wasted conservation resources if the taxon is based on purely environmentally-induced differences and is simply part of a more widespread species of lesser concern. Some 51 species of Salmo trouts are currently recognised in FishBase and recent publications, including several in recent years. Most trout species have been classified on colouration, spotting pattern, occurrence of parr marks in adults, dentition, scale counts, and body measurements. In many, but not all situations, these characters are subject to environmental modulation with the effects of phenotypic plasticity and adaptation being difficult to disentangle. Body measurements are, in some cases, converted to ratios of standard length, but this approach has long been regarded as inappropriate due to allometric growth. Often insufficient specimens and populations are examined to give a true picture of intra- and inter-population variability.
An important criterion in taxonomy is that the characters used to define a species can be used to identify individuals to that species with ≥ 99% of individuals being correctly assigned (Mayr, 1963), either using molecular approaches or genetically based life history and morphological differences. Etheridge et al (2012) found that the power of supposedly diagnostic morphological characters to identify individuals of three putative Coregonus species was low (27%) due to the species descriptions being based on a few specimens, and as a result of phenotypic plasticity.
Given that a reference sequence is available for brown trout and that the determination of full genomic sequences is now relatively straightforward, is there any reason why a DNA sequence in an appropriate depository cannot be the name-bearing type sequence for a species? Linked to the type nuclear sequence should be DNA specimens, which can be used for further study. Once isolated, it can be stored nearly indefinitely. DNA can be easily shared for secure, multi-site curation. Since it takes up little space and can be stored at room temp there is no reason why all national museums should not be involved in such curation. Mitochondrial DNA sequences, while much easier to obtain, are problematic due to the potential for horizontal transfer and the linkage of genes. There are several examples of incongruence between nuclear and mtDNA. Use of only part of the nuclear genome could also be potentially problematic due to differentiation between some closely related trout being present in localised genomic ‘islands’. Sufficient DNA sequences to represent intra-specific variability would be required. Clearly international collaboration would be required to cover the entire Salmo trout range and a meaningful number of specimens. Do others consider this a potential way forward and what are the possible difficulties involved? Or is the real question whether conservation legislation should be species-based in the first place but instead be focused on populations, or groups of populations, as in North America using Evolutionarily Significant Units or Designatable Units?
To perform the luciferase assay for a microRNA-gene interaction, it is necessary to have a (wild-type) primer to amplify the specific binding site and also a mutated primer. But if your primer has a mutated sequence, which will not recognize your DNA region, how do you obtain the DNA sequence to insert into the cloning vector?
I have an inducible expression of the enzyme in a newly derived bacterial strain, and methylation may be affected. I want to test methylation efficiency with a bisulfite kit. The kit is the EZ DNA Methylation kit (Zymo Research, D5001).
Do I need to make a special plasmid with a certain sequence for the bisulfite test, or does it work more or less equally well for any DNA sequence?
I am writing an article about a molecular survey of viral poultry diseases. I want to compare several sequences that belong to the same virus but differ in pathogenicity and at the molecular level, so I need to choose, align, and build a phylogram for viral sequences isolated in our study and ones published in GenBank. I want to know the principles of aligning sequences, e.g., is it possible to align short and long sequences? When should gaps in the sequences be deleted?
I have a clone of DNA sequence encoding the full spike protein (S1+S2)
I have a list of Ensembl protein IDs ("ENSP...", obtained from PAXdb) and I wish to find their matching DNA sequences.
It seems trivial, but I haven't found a way to do it...
I could find the appropriate gene ID for each protein and then get the CDS nucleotide sequence, but that seems inaccurate (because of alternative splicing).
I am setting up a technique for mutation detection. There are many DNA sequences involved: the gene has 51 exons, and we want to detect whether any mutations exist across them. What is the best approach in terms of cost and time? Any suggestions will be highly appreciated.
I am trying to subclone the TCF7 gene (transcript variant 1) into the pCDH-CMV-MCS-EF1 Puro vector from another commercially available vector by In-Fusion cloning (recombination). I got a nice, specific band of the appropriate size in the PCR amplification and obtained positive colonies, which I sent for sequencing; when I BLASTed the results I could see the match in the DNA sequences. Interestingly, the DNA sequences match from the start to around 150 bp, then around 200 bp is missing, and then the rest matches up to the stop codon. Could anyone please suggest why DNA sequence is missing from the middle of the gene in the clone?
Vector Database - pCDH-CMV-MCS-EF1-Puro
PCR cloning kits make it possible to clone a PCR product into a variety of vectors through a simple blunt-ended ligation.
However, would this still work with PCR products containing modified ends (obtained with modified oligos such as biotinylated or fluorescein-labeled primers)? I only need the DNA sequence to be inserted into the vector, not the label.
Recently I faced a problem while separating 3 species of Lethe butterflies (the Woodbrowns group, Satyrinae) while working along an elevation gradient and across different forest habitats in Kedarnath Musk Deer Reserve in the Western Himalaya. Based on wing morphology I separated them into 4 species out of the 5 known from the region, but examination of the male genitalia revealed only 3 species, while the rest were all hybrids!!! Can DNA barcoding help in the identification of hybrids and their parents?
We have a PCR product that is approximately 350 bp, and we want to see if there are more DNA sequences within that product. We decided to use a polyacrylamide gel, but it's the first time we have run one and no one else in the lab has done it before. We don't know the recommended voltage and approximate run time. We did a first try and let it run for 16 hrs at 50 mA, since that's what the manual said, but the manual is written for proteins. Any suggestions will be appreciated.
I attach the gel image; it's messy.
I am a layperson in this subject. Is DNA sequencing the best way to identify the different types of soil bacteria present in soil samples? Ideally, I'd like to find a technique that allows me to identify all the different types of bacteria (from those that could be identified) present in a given soil sample. Thanks.
I would like to design primers that are complementary to, or begin in, the UTRs (5' and 3') of a gene. Where exactly and how do I find the coding sequence including the UTRs?