The possibility of utilizing the cloze procedure as a measure of ESL (English as a Second Language) proficiency has recently aroused considerable interest. Studies by Darnell (1968), Bowen (1969), Kaplan and Jones (1970), Oller and Conrad (1971), and Oller and Inal (1971) have demonstrated that the cloze method has merit, but several important questions remain unanswered. Among them are the matters of scoring and level of difficulty, and their respective contributions to the effectiveness of cloze tests as measures of ESL proficiency. A further, and possibly more important, question concerns the nature of the cloze task and the skills involved in performing it. Previous research has shown repeatedly that the best and most convenient scoring method when native speakers are tested is simply to count the number of exact words restored to the context (Taylor, 1953; Rankin, 1957; Ruddell, 1963; Bormuth, 1965). Although native speakers tend to earn higher mean scores when acceptable substitutes are counted as correct, the increase in total test variance is so slight that the extra effort involved is scarcely worthwhile. It is considerably simpler to ask of each fill-in, "Does it match the original word?" than to ask, "Is this response contextually acceptable?" Moreover, scorers are likely to be less reliable in the latter case. In spite of all this, researchers who have experimented with the cloze method as a measure of second-language proficiency have often preferred scoring systems that give credit for contextually acceptable responses. Some have even gone so far as to give partial credit for responses which, though clearly incorrect, indicate some measure of comprehension. Darnell (1968) scored responses to given items on the basis of native-speaker responses to those same items. Bowen (1969) weighted responses according to their degree of correctness, subjectively determined.
Oller and Inal (1971) counted any contextually acceptable response as correct. Since it has been clearly established that allowing contextually acceptable responses in addition to exact-word fill-ins makes little difference with native speakers, why should we expect things to be different when non-natives are tested? There are several reasons. One is that the exact-word scoring criterion may produce a cloze test that is simply too difficult for non-natives even though it is not for natives. Also, there is something intuitively unsettling about requiring a non-native speaker to guess the exact word in order to receive full credit for an answer. Suppose, for example, that an item reads, "The ____ went down to the stream." If the exact word is, say, "child," is it reasonable to class "horse," "dog," "animal," etc., with clearly incorrect fill-ins like "of," "and," "table," etc.? The task of guessing the exact word is not necessarily a language skill in the ordinary sense of the term.
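The difference between the two scoring criteria discussed above can be sketched computationally. The following is a minimal illustration only: the item, the responses, and the set of contextually acceptable substitutes are hypothetical and would in practice be established by judges or by native-speaker norms.

```python
# Contrast of exact-word scoring with acceptable-word scoring for a
# single hypothetical cloze item: "The ____ went down to the stream."

EXACT_WORD = "child"
# Hypothetical set of substitutes judged contextually acceptable.
ACCEPTABLE = {"child", "horse", "dog", "animal"}

def exact_word_score(responses):
    # Credit only responses that match the deleted word exactly.
    return sum(1 for r in responses if r == EXACT_WORD)

def acceptable_word_score(responses):
    # Credit any response judged acceptable in context.
    return sum(1 for r in responses if r in ACCEPTABLE)

responses = ["child", "horse", "table", "dog"]
print(exact_word_score(responses))       # 1
print(acceptable_word_score(responses))  # 3
```

As the sketch suggests, responses such as "horse" and "dog" are lost entirely under the exact-word criterion but counted under the acceptable-word criterion, which is the crux of the scoring question for non-native examinees.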