Text Linguistics - Science topic

Questions related to Text Linguistics
  • asked a question related to Text Linguistics
Question
12 answers
Text-linguistic contributions to the development of translation studies
Relevant answer
Answer
Please take a look at my work on the poem Tam o' Shanter by Robert Burns, which is very relevant to your question. There is a translation and an article about the translation challenges.
* a new translation of Tam o' Shanter, which you will find on Youtube:
* a monograph on the influence of Old Norse on the poem
* a short article about the translation
* a slide set of the translation
All found here:
  • asked a question related to Text Linguistics
Question
10 answers
I think that text summaries can be considered a separate genre of text because summaries have unique stylistic features: for example, a change of narrator, length and brevity, the deletion of detailed information, conjunctions that point to logical connections, discourse markers, etc. But the important point is that the purpose of the discourse, which determines the genre of a text, changes. The communicative purpose of a narrative text is not the same as the purpose of a summary of that text. In some studies, the summary is considered a genre of academic text. I am dealing with genre here in the context of a text schema. Although van Dijk claims that the summary is "reach the great proposition", a summary is not a dependent text merely because it keeps the topic/content of the original and sticks to the original text. Rather, it uses distinctive linguistic markers. If topic or content were all that mattered in a text, there would be no art or literature.
Relevant answer
Answer
It depends on the text type. In some cases your proposition is correct, but in others it may not work: for example, medical texts, scientific texts, and texts of a similar nature.
  • asked a question related to Text Linguistics
Question
874 answers
Do you know any aphorisms, old sayings, parables, folk proverbs, etc. on science, wisdom and knowledge, ...?
Please, quote.
Best wishes
Relevant answer
Answer
All too often a clear conscience is merely the result of a bad memory.
  • asked a question related to Text Linguistics
Question
3 answers
Could you please tell me which are the best available Arabic speech corpora for a TTS system? Please include non-free options as well.
Relevant answer
Answer
Text to speech system, Shatha
  • asked a question related to Text Linguistics
Question
4 answers
I am trying to use Stanford TokensRegex, however, I am getting an error in line number (11). It says that (). Please do your best to help me. Below is my code:
1 String file="A store has many branches. A manager may manage at most 2 branches.";
2 Properties props = new Properties();
3 props.put("annotators", "tokenize, ssplit, pos, lemma, ner, parse, dcoref");
4 StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
5 Annotation document = new Annotation(file);
6 pipeline.annotate(document);
7 List<CoreMap> sentences = document.get(CoreAnnotations.SentencesAnnotation.class);
8 for(CoreMap sentence: sentences)
9 {
10 TokenSequencePattern pattern = TokenSequencePattern.compile("[]");
11 TokenSequenceMatcher matcher = pattern.getMatcher(sentence);
12 while( matcher.find()){
13 JOptionPane.showMessageDialog(rootPane, "It has been found");
14 }
15 }
Relevant answer
Answer
Java's native regex engine is a horror for every Java developer. Why it is included in the release is a great puzzle.
For NLP processing I used a regex engine from Apache: https://opennlp.apache.org/docs/.
My own favorite, though, is the regex engine in Python, which simply works. However, you have to instantiate the Python engine from Java before you can use its regexps: http://www.jython.org/jythonbook/en/1.0/JythonAndJavaIntegration.html
  • asked a question related to Text Linguistics
Question
2 answers
Hi ,
I know that most existing probabilistic and statistical term-weighting schemes (TF-IDF and its variations) are based on the linked independence assumption between index terms. Semantic information retrieval, on the other hand, stresses the importance of linked dependence between index terms.
I am wondering: when is linked dependence between index terms vital, and when can we neglect it?
Note (dependence assumption): if two index terms have the same occurrences in a document, the terms tend to be dependent and should have the same term-weight values.
Thanks
Osman
Relevant answer
Answer
Hi Vladimir,
Thank you for your answer, but in Information Retrieval, partially judged document collections have an issue with relevance judgement values. Thus, I think term weights should carry a partial semantic relation, such as term-weight dependence in unjudged documents. The text classification problem, however, does not have this issue.
Best wishes,
Osman
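The independence assumption discussed in this thread can be made concrete with a small TF-IDF sketch in Python. It is only an illustration: the function name tf_idf and the three sample documents are made up here, and the weighting is the plain tf * log(N/df) variant rather than any particular system's scheme. Each weight depends only on the term's own counts, never on which other terms co-occur with it.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per document.

    Each term is weighted independently of every other term: the weight
    depends only on the term's own counts, never on co-occurring terms.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        tf = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return weights

# Made-up sample documents, for illustration only.
docs = [
    "query terms are weighted independently",
    "index terms may depend on other terms",
    "relevance judgements are partial",
]
weights = tf_idf(docs)
```

In the first document, "terms" and "are" happen to have identical counts, so they receive identical weights. Under this scheme that is a coincidence of frequencies, not a modelled dependence between the two terms, which is exactly the limitation the question raises.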
  • asked a question related to Text Linguistics
Question
1 answer
Hello!
I'm a university student trying to interpret the indicator diagram of an internal combustion engine. As part of this, I have to find the average specific heat ratio of the gas inside the cylinder, and to do that I have to find the specific heat at constant pressure (Cp). My professor gave me three approximate expressions, but only their names: Danisi(?), Khül, and JANAF. (I'm not sure about 'Danisi' because it is a Japanese name and I can't find it anywhere, so I just transcribed it as it is pronounced in my language. The professor says not to use it because he used it for his example, so it doesn't matter much; in fact it does matter, but I don't have enough time for that now.)
I tried the thermodynamics textbooks in the university library, I tried to google it, and I tried asking the professor (he gave me a textbook, but it is all in Japanese, which I never learned). I found out who Khül is and what JANAF (and the JANAF tables) are, but I can't find the approximate expressions for specific heat. I would be very glad if someone could give me a link to a page that explains them, or tell me where or how to find them.
In short, I want to find the JANAF approximate expression for specific heat and Khül's approximate expression for specific heat.
I'll attach an example that my professor offered.
Thank you for your time!
Relevant answer
Answer
The problem you are facing depends on the molecule you are considering. There is no general approximate equation. For some mixtures, however, databases exist that use piecewise polynomial expressions. I attach a report with these quantities for Martian atmosphere species.
JANAF is a collection of data that is approximate because it considers very few levels in the partition function. I also suggest the book Fundamental Aspects of Plasma Chemical Physics: Thermodynamics.
Hope it helps
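For what it is worth, fits to tabulated data of this kind are often written as a polynomial in temperature for c_p/R, valid over a stated temperature range. The Python sketch below assumes that form and an ideal gas (c_v = c_p - R). The coefficient list is a placeholder for a constant c_p = 3.5 R, not real table data, and the function names are made up for illustration.

```python
R = 8.314462618  # universal gas constant, J/(mol K)

def cp_molar(temperature, coeffs):
    """Molar c_p from a polynomial fit: c_p/R = a0 + a1*T + a2*T^2 + ..."""
    return R * sum(a * temperature ** i for i, a in enumerate(coeffs))

def heat_ratio(temperature, coeffs):
    """gamma = c_p / c_v, using c_v = c_p - R for an ideal gas."""
    cp = cp_molar(temperature, coeffs)
    return cp / (cp - R)

def mean_heat_ratio(t_start, t_end, coeffs, steps=100):
    """Average gamma over a temperature range by sampling evenly."""
    temps = [t_start + (t_end - t_start) * k / steps for k in range(steps + 1)]
    return sum(heat_ratio(t, coeffs) for t in temps) / len(temps)

# Placeholder coefficients: a constant c_p = 3.5 R (ideal diatomic gas).
# Real coefficients must be taken from the tables for each species.
diatomic = [3.5, 0.0, 0.0, 0.0, 0.0]
gamma = heat_ratio(300.0, diatomic)
```

With the placeholder coefficients the ratio is 1.4 at every temperature. With real, temperature-dependent coefficients the mean over the cylinder's temperature range would differ from the value at any single temperature, and a gas mixture would need a mole-fraction-weighted c_p first.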
  • asked a question related to Text Linguistics
Question
2 answers
I have some files containing, on each line, a Persian sentence, a tab, and then an English word. The English word gives the class of the sentence. Some files have 2 classes, some 3, and some more. I extracted 1000 words from the file and built a term-document matrix. The columns of the matrix are the classes and the rows are the words. Now I want to apply SVD to this matrix, which returns U, sigma and V (Vt), and then do dimension reduction. 1) How can I do that? (I've enclosed the code (Python 3), but I'm not sure whether it is right; I copied it from the net.)
2) When I print the term-document matrix, it only shows the first and last lines of the matrix (because it is too large). How can I print the whole matrix?
Then I have to find each word's vector according to U*sigma. 3) How should I build such a vector (actually a matrix whose rows are the rows of the U*sigma matrix)?
Hint: this is part of an LSA project.
Relevant answer
Answer
Hello Vahideh,
If your original matrix is M*N and you want to reduce it to M*n, you can use the following piece of code.
from sklearn.decomposition import TruncatedSVD
# n is the target number of dimensions; it must be smaller than N
svd = TruncatedSVD(n_components=n, n_iter=7, random_state=42)
# fit and project in one step; the result has shape (M, n) and equals U * sigma
reduced = svd.fit_transform(term_document_matrix)
  • asked a question related to Text Linguistics
Question
1 answer
I have a file containing a Persian sentence, a tab and then an English word. I have to delete stop words and punctuation in the file. I wrote the code in Python 3, but in some words the punctuation is attached to the word and is counted as part of the word rather than as punctuation, so it can't be deleted. So I need to use a regular expression to delete stop words. I tried to use one in the code below, but I couldn't get it to work. How can I change the code below so that it works correctly? (In fact, what exactly should I write?) Thanks.
Relevant answer
Answer
In your code you're trying to remove stop words and then delete punctuation from the remaining string.
If this is exactly what you want to achieve, I would start by removing punctuation (in Python 3, str.translate needs a table built with str.maketrans):
some_text = some_text.translate(str.maketrans('', '', string.punctuation))
and then simply:
' '.join([word for word in some_text.split() if word not in StopWords])
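One caveat for Persian text: string.punctuation only lists ASCII characters, so marks such as the Arabic comma (،) or question mark (؟) would stay attached to the words. A minimal Python 3 sketch that treats every Unicode punctuation character as a separator is given below. The function names, the stop-word set and the sample line are made up for illustration.

```python
import unicodedata

def strip_punctuation(text):
    """Replace every Unicode punctuation character with a space."""
    return "".join(
        " " if unicodedata.category(ch).startswith("P") else ch
        for ch in text
    )

def remove_stop_words(text, stop_words):
    """Drop punctuation first, then any token found in stop_words."""
    tokens = strip_punctuation(text).split()
    return " ".join(tok for tok in tokens if tok not in stop_words)

# Hypothetical stop-word set and sample line, for illustration only.
stop_words = {"و", "از", "به"}
line = "کتاب، و دفتر؟\tbook"
sentence, label = line.split("\t")
cleaned = remove_stop_words(sentence, stop_words)
```

Replacing punctuation with a space, rather than deleting it, keeps two words apart when they were joined only by a punctuation mark. The zero-width non-joiner used inside Persian words is a format character, not punctuation, so it is left untouched.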
  • asked a question related to Text Linguistics
Question
1 answer
I have a list of Persian words and a file which contains a sentence, a tab and then an English word in each line. For each line of the file, I want the code to return "1" for every list word that occurs in the line and "0" for every list word that does not. For example, if my list contains 20 words and my file has 50 lines, the code should return 50 rows with 20 columns of 1s and 0s plus a final column with the English word (21 columns in all). The numbers should be separated by commas (as in the picture below). Finally, I want to write the rows to a new file. The code below returns just one column. How can I fix it? Thanks.
Relevant answer
Answer
Try the attached code (tested in Python 2.7).
Best regards,
Mohammad
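Since the attached code is not reproduced here, the following is an independent Python 3 sketch of the layout described in the question, not the attachment itself. The vocabulary, the two input lines and the function name are made up for illustration, and an in-memory buffer stands in for the output file.

```python
import csv
import io

def binary_rows(lines, vocabulary):
    """Yield one row per line: a 0/1 flag per vocabulary word, then the label."""
    for line in lines:
        sentence, label = line.rstrip("\n").split("\t")
        words = set(sentence.split())
        yield [1 if word in words else 0 for word in vocabulary] + [label]

# Hypothetical vocabulary and input lines (sentence, tab, class label).
vocabulary = ["خوب", "بد", "کتاب"]
lines = ["این کتاب خوب است\tpositive\n", "فیلم بد بود\tnegative\n"]

out = io.StringIO()  # stands in for open("output.csv", "w", newline="")
csv.writer(out).writerows(binary_rows(lines, vocabulary))
result = out.getvalue()
```

Each output row has one column per vocabulary word plus the label, so a 20-word list gives the 21 columns the question asks for. The sketch assumes whitespace-separated words with punctuation already removed.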
  • asked a question related to Text Linguistics
Question
2 answers
I have been working on his life, work and texts (German, Latin) for several years. I would like to compare my results and opinions with anybody else's.
Keep in mind that he published his works not under his own name but under pseudonyms, so if this is your field too, you surely know that :)
Relevant answer
Answer
Dear Susan, thank you very much. A footnote in this text showed me how differently a Slovak scholar is viewed in the scientific world. I sent them a message that he was not an Austrian writer, as he lived, taught and worked mainly in Slovakia. A very interesting experience. Slavka
  • asked a question related to Text Linguistics
Question
3 answers
I am looking for text chat data (from any kind of call center). If anyone knows of any, please provide a link or the data.
Relevant answer
Answer
It depends on your research aim...
Do you need to retrieve content? A good free software for topic detection is Iramuteq: http://www.iramuteq.org/ (you must have R installed on your computer).
Do you need a text classification? cfr. R
  • asked a question related to Text Linguistics
Question
59 answers
The acronym R. S. V. P. (‘Répondez s’il vous plaît’) and other good old etiquette abbreviations have been used in communication since the 18th century.
Nowadays, new ones such as CYAL8R ‘see you later’ (seeyalata), IMNSHO ‘in my not so humble opinion’ and TGIF ‘Thank God It's Friday’ have hit the Internet.
More English abbreviations and ideograms are here:
If you like them, please explain why you prefer to use them instead of full words or phrases.
Relevant answer
Answer
Hi, Napoleon
You are quite right, emoticons should not be ambiguous; I hadn't thought of it that way. That is their advantage: a single emoticon should make things clear unambiguously, better than many words. It is very realistic, whereas poems can be as abstract as an abstract expressionist painting.
Narayanan
  • asked a question related to Text Linguistics
Question
10 answers
1/ If we consider a context defining a term as a set of sentences giving necessary information about the meaning of this term, would it be a contextual definition or a definitional context?
2/ Can we find other types of definitions in one definitional context?
Relevant answer
Answer
Contextual definition: a definition of a phenomenon that is context dependent vs Definitional context: a situation or context that is of cardinal importance in the definition of a phenomenon. As can be seen, therefore, though the two are related, they may not mean the same thing: one is in essence A DEFINITION, while the other is A CONTEXT. 
  • asked a question related to Text Linguistics
Question
4 answers
This was a concept regarding health imported to Japan from China between the 7th and 10th centuries.
Yojo connected health with diet, mental control, exercise and sexual restraint. 
I am translating a book related to this topic, however, I can't  find the equivalent Japanese word or Chinese word for it...
Relevant answer
Answer
Xiaonan Julia Huang: The word "yōjō" in Japanese that you are looking for is written 養生. Since I do not know the context in which you are trying to use this word, here is a web link where you can explore further (in Japanese): < http://dic.search.yahoo.co.jp/dsearch?p=%E9%A4%8A%E7%94%9F&ei=UTF-8&b=1&dic_id=etc&stype=full >.
You might also be interested in this ようせい【養生 yǎng shēng】Chinese definition: < https://kotobank.jp/word/%E9%A4%8A%E7%94%9F-653073#E4.B8.96.E7.95.8C.E5.A4.A7.E7.99.BE.E7.A7.91.E4.BA.8B.E5.85.B8.20.E7.AC.AC.EF.BC.92.E7.89.88 >.
All the best, YK
  • asked a question related to Text Linguistics
Question
11 answers
Can anyone express the relationship between word order and communication?
Relevant answer
Answer
If you were able to come up with a formula for the relationship between word order and communication, you would be on the way to winning something like a Nobel prize. You would resolve some very difficult NLP challenges. However, without getting into the deeper problem, let's have a look at what we are talking about. Word order refers to the conventional arrangement of words in a phrase. In other words, it is the sequence of words in a sentence, especially as governed by grammatical rules such as SVO, which is a relatively fixed pattern in English (in affirmative sentences). The primary word orders of interest are the constituent order of a clause (the relative order of subject, object, and verb); the order of modifiers (adjectives, numerals, demonstratives, possessives, and adjuncts) in a noun phrase; and the order of adverbials. You can then play around with these and see how they affect meaning. OSV is marked in English: "Bread they ate". But you can always contextualise it: "Bread they ate all the time because there was nothing else to eat".
  • asked a question related to Text Linguistics
Question
6 answers
I am trying to develop software that assigns suitable attributes to entity names depending on the entity type.
For example, if I have entities such as doctor, nurse, employee, customer, patient, lecturer, donor, user, developer, designer, driver, passenger and technician, they will all have attributes such as name, sex, date of birth, email address, home address and telephone number because all of them are people.
As a second example, words such as university, college, hospital, hotel and supermarket can share attributes such as name, address and telephone number because all of them can be organizations.
Are there any Natural Language Processing tools or software that could help me achieve my goal? I need to identify the entity type as person or organization and then attach suitable attributes according to the entity type.
I have looked at Named Entity Recognition (NER) tools such as the Stanford Named Entity Recognizer, which can extract entities such as Person, Location, Organization, Money, Time, Date and Percent, but it was not really useful.
I could do it by building my own gazetteer; however, I would prefer not to take this option unless I fail to do it automatically.
Any help, suggestions and ideas will be appreciated.
Relevant answer
Answer
Mussa,
This might not be a very helpful answer, but from my understanding NLP techniques often rely on context to understand what is being discussed.  So a single word like "doctor" is very difficult to understand unless it is in some kind of context like "a doctor treats sick people".  From the sentence, an NLP machine might recognize that doctor is a noun and might infer something about relating to people.  Without this context, it will be tough to discern the categorical differences between single words. 
It might be less complicated (although more time-consuming) to create a predefined list of terms that you would like to classify and then simply match words to those lists in order to create your associated list of features for a given entity.
Hope that helps.
Sean
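The predefined-list idea suggested in this answer could be sketched roughly as follows in Python. The terms and attribute lists are taken from the examples in the question; the dictionary layout and function names are made up for illustration, and a real system would need far larger lists.

```python
# Hypothetical gazetteer: entity type -> known terms, and type -> attributes.
GAZETTEER = {
    "person": {"doctor", "nurse", "employee", "customer", "patient", "driver"},
    "organization": {"university", "college", "hospital", "hotel", "supermarket"},
}
ATTRIBUTES = {
    "person": ["name", "sex", "date of birth", "email address",
               "home address", "telephone number"],
    "organization": ["name", "address", "telephone number"],
}

def entity_type(term):
    """Return the gazetteer type of a term, or None if it is not listed."""
    normalized = term.strip().lower()
    for type_name, terms in GAZETTEER.items():
        if normalized in terms:
            return type_name
    return None

def attributes_for(term):
    """Return the default attribute list for a term (empty if unknown)."""
    return ATTRIBUTES.get(entity_type(term), [])
```

A term that is not in any list, such as "branch", simply gets no attributes, which makes the gaps in the gazetteer easy to spot and fill in by hand.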
  • asked a question related to Text Linguistics
Question
21 answers
Maybe a tool that would also let me annotate parallel texts?
Hi everyone! I'm a linguist having basic computer skills, so I have only some vague notions about Java, Python or other programming languages. I'm interested in annotating a small parallel corpus for discourse relations and connectives, so I need to be able to define several criteria in my analysis (arguments, connectives, explicitness/implicitness, etc.). I would welcome any suggestions... Thanks!
Relevant answer
Answer
Hi Sorina,
I am using SALT for Spanish and English (http://www.saltsoftware.com/). I don't know which languages you need to handle. It is a very user-friendly tool. You can transcribe, define your own word lists (concordances) and declare your own tags ([tag]).
You can check also the CHILDES project  tools (http://childes.psy.cmu.edu/).
Hope it helps.
Good luck!
  • asked a question related to Text Linguistics
Question
5 answers
I want to analyze Urdu text linguistically but I couldn't find any software to measure the frequency of different items for the purpose.
  • asked a question related to Text Linguistics
Question
4 answers
Many tools are used to determine parts of speech (POS), such as the Stanford tagger, TreeTagger and GATE. Which is the most commonly used tagger with a low error rate for British English?
Relevant answer
Answer
You can download Stanford Core NLP at link below:
and also you can check online by available demo:
  • asked a question related to Text Linguistics
Question
1 answer
I was wondering whether there is a ready-to-use tool for syntactic normalization of, e.g., noun phrases: "treatment of acne" --> "acne treatment", etc. Although a rule-based approach is possible, there must be a more robust solution for that.
Relevant answer
Answer
I think Halliday's systemic functional grammar has something about that. I suggest you check that tool.
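For comparison, the rule-based approach mentioned in the question can be as small as one regular expression. The Python sketch below is deliberately naive and is not the robust solution being asked for. The pattern and function name are made up for illustration, and it only handles a single-word head and a single-word modifier.

```python
import re

# Matches "<head> of <modifier>" where both sides are single words.
OF_PHRASE = re.compile(r"\b(\w+) of (\w+)\b")

def normalize_of_phrase(phrase):
    """Rewrite 'treatment of acne' as 'acne treatment'."""
    return OF_PHRASE.sub(r"\2 \1", phrase)

normalized = normalize_of_phrase("treatment of acne")
```

The rule breaks as soon as the modifier is longer than one word ("treatment of severe acne" comes out as "severe treatment acne"), which is why a chunker or parser that identifies whole noun phrases would be needed for anything beyond toy input.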
  • asked a question related to Text Linguistics
Question
5 answers
How can I use it?
Relevant answer
Answer
You have your answer. I didn't even know there was such a parser.
  • asked a question related to Text Linguistics
Question
3 answers
, i.e., finding the existence and quantity of a set of adjectives from a given set of sentences where the sentences do not contain the adjectives?
Relevant answer
Answer
hello priyanka,
here's the link :
Textual entailment is a directional relation between text fragments. The relation holds whenever the truth of one text fragment follows from another text. In the TE framework, the entailing and entailed texts are termed text and hypothesis, respectively. Textual entailment is not the same as pure logical entailment; it has a more relaxed definition.
  • asked a question related to Text Linguistics
Question
3 answers
I would like code to run the Stanford Named Entity Recognizer (NER). Suppose that I have a text and I would like the Stanford NER to recognize the entities mentioned in it.
Relevant answer
Answer
You should probably ask your question on the Stanford NER mailing list.
The instructions describe a command line mode, so you don't have to write any code.
If you want to write code, it looks like they have interfaces for many languages, like Python, PHP, C#, etc. Here's a man page showing code in Perl:
There's also an on-line demo but the amount of text it accepts is pretty small.
  • asked a question related to Text Linguistics
Question
1 answer
It will be appreciated if I could have examples with code, tutorial or any other useful resource.
Relevant answer
  • asked a question related to Text Linguistics
Question
4 answers
I am trying to use Stanford TokensRegex to design patterns. I am attempting to catch "A manager may manage at most 2 branches", which is mentioned once in the text; however, I failed to match it. Below is my code:
String file="A store has many branches. Each branch must be managed by at most 1 manager. A manager may manage at most 2 branches. The branch sells many products. Product is sold by many branches. Branch employs many workers. The labour may process at most 10 sales. It can involve many products. Each Product includes product_code, product_name, size, unit_cost and shelf_no. A branch is uniquely identified by branch_number. Branch has name, address and phone_number. Sale includes sale_number, date, time and total_amount. Each labour has name, address and telephone. Worker is identified by id’.";
Properties props = new Properties();
props.put("annotators", "tokenize, ssplit, pos, lemma, ner, parse, dcoref");
StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
// read some text in the text variable
// create an empty Annotation just with the given text
Annotation document = new Annotation(file);
// run all Annotators on this text
pipeline.annotate(document);
// these are all the sentences in this document
// a CoreMap is essentially a Map that uses class objects as keys and has values with custom types
List<CoreMap> sentences = document.get(CoreAnnotations.SentencesAnnotation.class);
for(CoreMap sentence: sentences)
{
TokenSequencePattern pattern = TokenSequencePattern.compile("A manager may manage at most 2 branches");
String sentence1=sentence.toString();
String[] tokens = sentence1.split(" ");
TokenSequenceMatcher matcher = pattern.getMatcher(document.get (CoreAnnotations.SentencesAnnotation.class));
while( matcher.find()){
JOptionPane.showMessageDialog(rootPane, "It has been found");
}
}
Please suggest any books, articles which could help me in learning to design patterns in Stanford TokensRegex within Stanford CoreNLP.
Relevant answer
Answer
Consider the following code
Binding of variables for use in compiling patterns:
Use Env env = TokenSequencePattern.getNewEnv() to create a new environment for binding
Bind string to attribute key (Class) lookup: env.bind("numtype", CoreAnnotations.NumericTypeAnnotation.class);
Bind patterns / strings for compiling patterns
// Bind string for later compilation using: compile("/it/ /was/ $RELDAY");
env.bind("$RELDAY", "/today|yesterday|tomorrow|tonight|tonite/");
// Bind pre-compiled pattern for later compilation using: compile("/it/ /was/ $RELDAY");
env.bind("$RELDAY", TokenSequencePattern.compile(env, "/today|yesterday|tomorrow|tonight|tonite/"));
  • asked a question related to Text Linguistics
Question
17 answers
I would like to know which is the best natural language software for recognizing parts of speech with a small percentage of errors. I have used Stanford CoreNLP, but it sometimes produces errors.
Relevant answer
Answer
I assume you are referring to English POS tags. In that case, I would suggest TreeTagger, which seems to be very popular among my colleagues.
  • asked a question related to Text Linguistics
Question
4 answers
The CDA framework has been widely used by Chinese linguistic analysts recently. But the real difficulty is that, due to the very distinctive political and historical background of China and the dichotomy between western and eastern ideology, what CDA scholars generally agree on sometimes does not fit the situation in Chinese society.
So what is the best way to apply the CDA framework to Chinese issues and at the same time be truthful to the indigenous environment, so as to be really socio-historically significant?
Relevant answer
I think you are conflating power relations and ideological polarisation, even though both are addressed by CDA scholars, as I did in my PhD thesis. Perhaps one of my papers here, and another which will be published very soon in an open-access journal, can give you some ideas. You also need to be familiar with CDA approaches by referring to Fairclough, van Dijk and Wodak. I designed an approach of my own when I studied the discourse of the 2003 Iraq war. My book on Amazon.com can also help you if you can find it in your university library, because it is expensive on the market.
  • asked a question related to Text Linguistics
Question
4 answers
I would like to extract the attributes of a table that are mentioned in plain text. What is the best approach to follow: supervised, semi-supervised or unsupervised?
I have some sample case studies, but I do not have a large training set.
Below is an example of a case study:
"Consider the following relational database for Fester Zoo. Fester Zoo wants to maintain information about its animals, the enclosures in which they live, and its zookeepers and the services they perform for the animals. In addition, Fester Zoo has a program by which people can be sponsors of animals. Fester Zoo wants to track its sponsors, their dependents, and associated data. Each animal has a unique animal number and each enclosure has a unique enclosure number. An animal can live in only one enclosure. An enclosure can have several animals in it or it can be currently empty. A zookeeper has a unique employee number. Every animal has been cared for by at least one and generally many zookeepers; each zookeeper has cared for at least one and generally many animals. Each time a zookeeper performs a specific, significant service for an animal the service type, date, and time are recorded. A zookeeper may perform a particular service on a particular animal more than once on a given day.
A sponsor, who has a unique sponsor number and a unique National Insurance number, sponsors at least one and possibly several animals. An animal may have several sponsors or none. For each animal that a particular sponsor sponsors, the zoo wants to track the annual sponsorship contribution and renewal date. In addition, Fester Zoo wants to keep track of each sponsor’s dependents. A sponsor may have several dependents or none. A dependent is associated with exactly one sponsor."
Any books or online resources are appreciated.
Relevant answer
Answer
If you have access, start with some very fundamental ones like this one by R. Grishman (Information Extraction Techniques and Challenges), or R. Grishman's notes for the Tarragona winter school in 2012 which present the domain very well (link below). The latter provides a good bibliography section, additionally.  I believe this is a good starting point, instead of a long list of specific papers. 
If you need more information on a specific topic covered (or not) by these introductions, don't hesitate to follow up with more questions. 
  • asked a question related to Text Linguistics
Question
3 answers
I need to extract all words after the following pattern "/[Ee]ach/ ([tag:NN]|[tag:NNS]) /has|have/ /\\w|[ ]|[,]/" until the end of the sentence but I am getting unexpected output:
in the second sentence I am getting: "Each campus has a" where the right output is "Each campus has a different name, address, distance to the city center and the only bus running to the campus " 
in the third sentence I am getting  "Each faculty has a " where the right output is " Each faculty has a name, dean and building "
in the fourth sentence the pattern is unable to match the right output which is " each problem has solution, God walling"
I would appreciate it if you could help me solve this problem. I think that my pattern has not been written correctly. Below is my code:
String file="ABC University is a large institution with several campuses. Each campus has a different name, address, distance to the city center and the only bus running to the campus.  Each faculty has a name, dean and building. this just for test each problem has soluation, God walling.";
  Properties props = new Properties();
  props.put("annotators", "tokenize, ssplit, pos, lemma, ner, parse, dcoref");
  StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
  Annotation document = new Annotation(file);
  pipeline.annotate(document);
  List<CoreLabel> tokens = new ArrayList<CoreLabel>();
  List<CoreMap> sentences = document.get(CoreAnnotations.SentencesAnnotation.class);
  for(CoreMap sentence: sentences)
   {          
    for (CoreLabel token: sentence.get(CoreAnnotations.TokensAnnotation.class))
            tokens.add(token);
    TokenSequencePattern pattern = TokenSequencePattern.compile("/[Ee]ach/ ([tag:NN]|[tag:NNS]) /has|have/ /\\w|[ ]|[,]/");
    TokenSequenceMatcher matcher = pattern.getMatcher(tokens);
    while( matcher.find()){
        JOptionPane.showMessageDialog(rootPane, matcher.group());
     }
     tokens.removeAll(tokens);
   }
Relevant answer
Answer
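You can also forget Stanford TokensRegex and try something like: "[Ee]ach .* ha(s|ve).*\." Applied to one sentence at a time, that will match the three sentences that contain "each ... has" (run on the whole text, the greedy .* would span sentence boundaries).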
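A plain-regex version of this suggestion might look like the Python sketch below. It is only an illustration: the sentence splitter is a naive split after each period, the capture groups are an addition to the pattern given above, and without POS tags the pattern cannot check that the word after "each" is really a noun.

```python
import re

text = ("ABC University is a large institution with several campuses. "
        "Each campus has a different name, address, distance to the city "
        "center and the only bus running to the campus. "
        "Each faculty has a name, dean and building.")

# Naive sentence split after each period (abbreviations would break it).
sentences = [s.strip() for s in re.split(r"(?<=\.)\s+", text) if s.strip()]

# Capture the word after "each" and everything after has/have up to the period.
pattern = re.compile(r"[Ee]ach (\w+) ha(?:s|ve) (.*)\.")

matches = []
for sentence in sentences:
    found = pattern.search(sentence)
    if found:
        matches.append((found.group(1), found.group(2)))
```

Matching one sentence at a time keeps the greedy `.*` from running past the end of a sentence, so each match stops at that sentence's own final period.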
  • asked a question related to Text Linguistics
Question
15 answers
Our large SMS corpus in French (88milSMS) is available. User conditions and downloads can be accessed here: http://88milsms.huma-num.fr/
Is there a website that lists all corpora available to the NLP and text-mining communities?
Relevant answer
Answer
Hello,
Thanks Ali for the pointer. We can indeed help you share it with the HLT community and give it some further visibility at ELRA/ELDA (http://www.elra.info and http://www.elda.org). You can have a look at our ELRA Catalogue (http://catalog.elra.info/) and the Universal Catalogue (http://universal.elra.info/) and get in touch with us for any further information (http://www.elda.org/article.php?id_article=68). We'll be happy to help! Kind regards, Victoria.
  • asked a question related to Text Linguistics
Question
7 answers
Any electronic resources, including books, examples and tutorials, are appreciated.
Relevant answer
Answer
A simplified definition of a token in NLP is as follows: a token is a string of contiguous characters between two spaces, or between a space and a punctuation mark. A token can also be an integer, a real number, or a number with a colon (a time, for example: 2:00). All other symbols are tokens themselves, except apostrophes and quotation marks inside a word (with no space), which in many cases mark acronyms or citations. A token can represent a single word or a group of words (in morphologically rich languages such as Hebrew), as in the token "ולאחי" (VeLeAhi), which comprises 4 words: "And to my brother".
A string, as mentioned by one of the previous researchers who responded, is a concept taken from programming languages.
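The simplified definition above can be turned into a small regular-expression tokenizer. The Python sketch below is one possible reading of that definition, not a standard tokenizer. The pattern and function name are made up for illustration, and it does not attempt the morphological splitting mentioned for Hebrew.

```python
import re

TOKEN = re.compile(r"""
    \d+:\d+          # clock times such as 2:00
  | \d+\.\d+         # real numbers such as 3.14
  | \w+(?:'\w+)*     # words, keeping internal apostrophes (don't)
  | [^\w\s]          # any other symbol is a token of its own
""", re.VERBOSE)

def tokenize(text):
    """Split text into tokens following the simplified definition above."""
    return TOKEN.findall(text)

tokens = tokenize("It's 2:00 now, and pi is 3.14!")
```

The order of the alternatives matters: times and real numbers are tried before plain words, otherwise "2:00" would be split into "2", ":" and "00".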
  • asked a question related to Text Linguistics
Question
9 answers
I need clauses or phrases from a sentence.
Relevant answer
Answer
There's an online demo available here
  • asked a question related to Text Linguistics
Question
3 answers
There is a lot of literature on genre evolution/transformation and the internet in general, and I've found some linguistic and discourse/communication-oriented literature on online book and movie reviews, but so far very little work dedicated specifically to the more recent product or consumer reviews of all kinds of objects, from cell phones to travel destinations, published e.g. on thematic websites and the connected forums (where users often post reviews or review fragments, in some cases mixed with other kinds of posts). Has anyone come across text-linguistic or discourse-analytical work on these genres?
Relevant answer
Answer
That would be very helpful, thanks a lot.
  • asked a question related to Text Linguistics
Question
1 answer
If there is any, what is the underlying technology?, i.e is it formant based, unit selection based, concatenative etc.?
Relevant answer
Answer
You may ask Anthony Beaumont; he has published some work on this and may know more about commercial products:
Intonation contour realisation for Standard Yoruba text-to-speech synthesis: A fuzzy computational approach. Computer Speech and Language, in press.
A Fuzzy Decision Tree-based Duration Model for Standard Yoruba Text-To-Speech Synthesis. Computer Speech and Language, in press.
A Computational Model of Intonation for Yoruba Text-to-Speech Synthesis: Design and Analysis. In the Proceedings of the Seventh International Workshop on Text, Speech and Dialogue (TSD 2004), Brno, Czech Republic, 2004-09-08.