Language Documentation - Science topic
Explore the latest questions and answers in Language Documentation, and find Language Documentation experts.
Questions related to Language Documentation
I want to store all identical documents in one place to save storage. When a user requests a document, it would then be translated into the user's language on demand. Is this a good approach?
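Roughly, this is the idea (a minimal sketch only; translate_text() is just a placeholder for whatever machine-translation service would actually be used):

    import hashlib

    # One canonical copy per document, keyed by a hash of its content,
    # so identical documents are stored only once.
    documents = {}       # content hash -> original text
    translations = {}    # (content hash, language code) -> cached translation

    def add_document(text):
        doc_id = hashlib.sha256(text.encode("utf-8")).hexdigest()
        documents.setdefault(doc_id, text)
        return doc_id

    def get_document(doc_id, lang):
        # Translate on demand the first time a language is requested,
        # then reuse the cached translation afterwards.
        key = (doc_id, lang)
        if key not in translations:
            # translate_text() is a placeholder, not a real library call.
            translations[key] = translate_text(documents[doc_id], target_lang=lang)
        return translations[key]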
I am looking for a syllabus/textbook that I can use to teach a new course on discourse and pragmatics, which can cover the following:
1) introducing the basics of pragmatic theory
2) introducing the basics of discourse analysis
3) covering corpus-driven analysis of discourse, preferably complemented by the use of some user-friendly free software for English discourse analysis (see the sketch after this list)
4) broadening into some other related discussions such as inferential/cognitive pragmatics, CDA, literary stylistics, CA and rhetoric, and possibly special DA for language documentation and archiving.
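For point 3, this is roughly the kind of hands-on corpus work I have in mind (a minimal sketch assuming Python with NLTK and one of its bundled corpora, purely as an illustration, not a prescription for the course):

    import nltk
    nltk.download("gutenberg", quiet=True)
    nltk.download("stopwords", quiet=True)

    from nltk.corpus import gutenberg
    from nltk.text import Text

    # Load a bundled sample corpus and wrap it for interactive exploration.
    words = gutenberg.words("austen-emma.txt")
    text = Text(words)

    # Keyword-in-context (concordance) lines for a discourse marker.
    text.concordance("however", lines=5)

    # Frequent collocations in the corpus.
    text.collocations()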
I would very much like to hear about others' experiences and suggestions.
I am a Chinese scientific researcher. There is a large amount of foreign-language reading material, but I don't always get much out of it. Is there a good way to collect and read foreign-language documents more quickly, especially research on agricultural industrialization and small-scale farmer households? I hope you can enlighten me. Thank you.
Hi everyone,
I need to perform a topic analysis on several corpora of documents, using a procedure that can be applied to each corpus independently in a standard way.
These are the characteristics of the corpora:
- the number of documents in each corpus will rarely be more than 500 and is usually around 50;
- documents are generally very short (from 20 to 200 words most of the time);
- each corpus is independent and analyses will never be done on merged corpora, but only within each corpus;
- the language of documents will be homogeneous within each corpus, but it may vary between corpora;
- the number of topics is unknown a priori, and topics will be different in every corpus.
Specifically, I’m looking for a procedure that:
- automatically detects the best number of recurrent topics in each corpus, while also taking into account that some documents may have “peculiar” topics that are not represented in any other document. These are not of interest and may be seen as a kind of “residual”. If these peculiar, single-document topics are identified as further topics by the model, that is fine too;
- gives, for every document, a % for each of the identified recurrent topics, plus a % that is “residual” with respect to them. Alternatively, the single-document topics could also be identified and scored in each document.
If I understand LDA models correctly, they don't allow this “residual” part and the %-scores of the topics always sum to 1. Moreover, they are not good at identifying single-document topics, and the result for these “outcast” documents is a roughly uniform score across all the topics, even though none of them is truly present in the document.
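For concreteness, this is the kind of LDA pipeline I have in mind (a minimal sketch using gensim, with the number of topics chosen by coherence; the toy "texts" below stand in for the tokenised documents of one real corpus). The per-document topic proportions it returns always sum to 1, with no residual component:

    from gensim import corpora
    from gensim.models import LdaModel, CoherenceModel

    # Tokenised documents of one corpus (toy example; real corpora are ~50 short docs).
    texts = [
        ["farmers", "market", "prices", "rise"],
        ["market", "prices", "fall", "slightly"],
        ["new", "school", "opens", "downtown"],
        ["school", "teachers", "strike", "downtown"],
        ["unrelated", "single", "document", "topic"],
    ]

    dictionary = corpora.Dictionary(texts)
    bow_corpus = [dictionary.doc2bow(doc) for doc in texts]

    # Pick the number of topics by maximising c_v coherence.
    best_k, best_score, best_model = None, float("-inf"), None
    for k in range(2, 6):
        lda = LdaModel(corpus=bow_corpus, id2word=dictionary,
                       num_topics=k, passes=10, random_state=0)
        score = CoherenceModel(model=lda, texts=texts, dictionary=dictionary,
                               coherence="c_v").get_coherence()
        if score > best_score:
            best_k, best_score, best_model = k, score, lda

    # Per-document topic proportions: these always sum to 1,
    # so there is no explicit "residual" share.
    doc_topics = [best_model.get_document_topics(bow, minimum_probability=0.0)
                  for bow in bow_corpus]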
Are there other topic analysis models that fit my task better, or have I misunderstood LDA?
Thank you very much!
Massimiliano
I am undertaking a translation of the Shawnee New Testament and am using the 1929 orthography instituted by Thomas Wildcat Alford in translating the four gospels. Research shows that other orthographies have been used in the past but have not been universally accepted by the Shawnee. The language is currently spoken by an estimated 200 people. Those of Shawnee heritage number more than 14,000, and many would appreciate this translation, as it could help to revitalize this endangered language. I am interested in any previous research or strategies to follow to make their dream come true. Can you help?