Vaughan R. Shanks's research while affiliated with RMIT University and other places
What is this page?
This page lists the scientific contributions of an author, who either does not have a ResearchGate profile, or has not yet added these contributions to their profile.
It was automatically created by ResearchGate to create a record of this author's body of work. We create such pages to advance our goal of creating and maintaining the most comprehensive scientific repository possible. In doing so, we process publicly available (personal) data relating to the author as a member of the scientific community.
If you're a ResearchGate member, you can follow this page to keep up with this author's work.
If you are this author, and you don't want us to display this page anymore, please let us know.
Publications (5)
Automatic categorisation is an important technique for the management of large document collections. Categorisation can be used to store or locate documents that satisfy an information need when the need cannot be expressed as a concise list of query terms. Inverted indexes are used in all query-based retrieval systems to allow efficient query proc...
Categorisation is a useful method for organising documents into subcollections that can be browsed or searched to more accurately and quickly meet information needs. On the Web, category-based portals such as Yahoo! and DMOZ are extremely popular: DMOZ is maintained by over 56,000 volunteers, is used as the basis of the popular Google directory, an...
This is RMIT's first year of participation in the TDT evaluation. Our system uses a linear classifier to track topics and an approach based on our previous work in document routing. We aimed this year to develop a baseline system, and to then test selected variations, including adaptive tracking. Our contribution this year to have implement...
Citations
... In addition to the inverted index, we store statistical information about the terms, such as document frequency and inverse document frequency [51]. Figure 3.6 shows a table of this information with and without lemmatisation, decomposition, and stop-word removal. ...
... Sparse inference: Earlier research has applied inverted indices for reducing the classification times for K-nearest Neighbours [Yang, 1994] and Centroid [Shanks et al., 2003]. The same reductions are gained for computing posterior probabilities for linearly interpolated language models in information retrieval [Hiemstra, 1998, Zhai and Lafferty, 2001b]. ...
... The classification was actually conducted on the summaries of the web text documents, which are organized in a word-based approach. Shanks and Williams used only the first fragment of each document for their classification task [19]. However, this approach only works well for documents which present an overview of the whole document at the beginning. ...