K. Smaili

Publications (6)

  • Conference Proceeding: Improving language models by using distant information
    A. Brun, D. Langlois, K. Smaili
    ABSTRACT: This study examines an original way to take advantage of distant information in statistical language models. We show that it is possible to use n-gram models with histories different from those used during training. These models are called crossing context models. Our study deals with classical and distant n-gram models. A mixture of four models is proposed and evaluated. A bigram linear mixture achieves a 14% improvement in perplexity, and the trigram mixture outperforms the standard trigram by 5.6%. These improvements were obtained without making the standard n-gram models more complex. The resulting mixture language model has been integrated into a speech recognition system. Its evaluation shows a slight improvement in word error rate on the data used for the francophone evaluation campaign ESTER [1]. Finally, the impact of the proposed crossing context language models on performance is reported for various speakers.
    Signal Processing and Its Applications, 2007. ISSPA 2007. 9th International Symposium on; 03/2007
  • Conference Proceeding: A comparative study of topic identification on newspaper and E-mail
    ABSTRACT: Not Available
    String Processing and Information Retrieval, 2001. SPIRE 2001. Proceedings. Eighth International Symposium on; 12/2001
  • Conference Proceeding: Experiment analysis in newspaper topic detection
    A. Brun, K. Smaili, J.-P. Haton
    ABSTRACT: We present several methods for topic detection on newspaper articles, using either a general vocabulary or topic-specific vocabularies. Specific vocabularies are determined manually or statistically. In both cases, we aim to find the most representative words of a topic. Several methods were tested. The first is based on perplexity and achieves a 100% topic identification rate on large test corpora when the first two proposed topics are taken into account. Other methods are based on statistical counts and achieve a 94% identification rate on smaller test corpora. The major challenge of this work is to identify topics from only a few words so that, during speech recognition, the most suitable language model can be selected.
    String Processing and Information Retrieval, 2000. SPIRE 2000. Proceedings. Seventh International Symposium on; 02/2000
  • Conference Proceeding: An anti-blocking control policy for tandem queueing networks
    J.-C. Hennet, K. Smaili
    ABSTRACT: Blocking phenomena may appear in any queueing network with limited-capacity queues. We propose a simple admission control policy to reduce the risk of blocking, which degrades system performance. Under classical Markov assumptions, the controlled system is modelled exactly in the case of two tandem queues and approximately for more than two queues. The quality of the approximate analytical model is then assessed by comparison with simulation results. It is established that in most cases, the performance of the controlled system is much higher than that of the uncontrolled system.
    Emerging Technologies and Factory Automation, 1995. ETFA '95, Proceedings., 1995 INRIA/IEEE Symposium on; 11/1995
  • Conference Proceeding: Which model for future speech recognition systems: hidden Markov models or finite-state automata?
    ABSTRACT: Representation of speech using hidden Markov models is compared with a representation based on finite-state automata. Experimental results are presented for both methods, and their respective advantages and disadvantages are discussed.
    Acoustics, Speech, and Signal Processing, 1994. ICASSP-94., 1994 IEEE International Conference on; 05/1994
  • Article: Statistical methods in multi‐speaker automatic speech recognition
    ABSTRACT: Automatic speech recognition and understanding (ASR) plays an important role in the framework of man-machine communication. Substantial industrial developments are at present in progress in this area. However, after some 40 years of effort, several fundamental questions remain open. This paper is concerned with a comparative study of four different methods for multi-speaker word recognition: (i) clustering of acoustic templates, (ii) comparison with a finite state automaton, (iii) dynamic programming and vector quantization, (iv) stochastic Markov sources. To make the results comparable, the four methods were tested on the same material: the ten digits (0 to 9), each pronounced four times by 60 different speakers (30 male and 30 female). In our experiments we distinguish between multi-speaker systems (capable of recognizing words pronounced by speakers that were used during the training phase of the system) and speaker-independent systems (capable of recognizing words pronounced by speakers totally unknown to the system). Half of the corpus (15 male and 15 female speakers) was used for training, and the remaining half for testing.
    Applied Stochastic Models and Data Analysis 08/1990; 6(3):143-155.