Dominique Haughton

Bentley University · Department of Mathematical Sciences

PhD

About

130 Publications
34,121 Reads
2,463 Citations
Citations since 2017
16 Research Items
824 Citations
[Figure: citations per year, 2017-2023; vertical axis 0-140]
Additional affiliations
September 2013 - present
Université de Paris 1 Panthéon-Sorbonne
Position
  • Affiliated researcher SAMM
January 2005 - present
Toulouse 1 Capitole University
Position
  • Affiliated Researcher GREMAQ

Publications (130)
Article
This study estimates the causal effect of Rwanda’s unconditional cash transfer program (VUP-Direct Support) on the incidence of poverty, the poverty gap, and household food and non-food expenditure for direct support recipients. Our empirical analysis applies four matching methods to data from the 2013/14 household survey in order to estimate the p...
Chapter
France has a long tradition of using statistical (choropleth) maps, which use shading to represent the spatial distribution of a variable, such as population, by department. Such maps lead the observer to underestimate the importance of urban areas, especially Paris. A solution that complements the choropleth map is to create a cartogram, which del...
Book
Business analytics is the application of statistical and quantitative analysis, as well as formal modeling, to decision making. This book examines under what circumstances and with which techniques one can reasonably infer cause and effect in a business setting and use the insight to drive business decisions. The book is rooted in realistic and imp...
Chapter
This paper analyzes financial ratios of 27 consumer discretionary firms listed on the S&P 500 over an eleven-year period from 2006–2016. It adopts a two-step approach wherein first a confirmatory factor analysis (CFA) on the financial time-series is conducted and the resulting constructs’ scores are then used to perform a cluster analysis using sel...
Article
Full-text available
Purpose: Data analytics techniques can help to predict movie success, as measured by box office sales or Oscar awards. Revenue prediction of a movie before its theatrical release is also an important indicator for attracting investors. While measures for predicting the success of a movie in box office sales and awards are largely missing, this study...
Chapter
This chapter discusses the extent to which modern analytics techniques can help us understand the success of movies, as measured by box office revenue or Oscars awarded. Interesting lessons emerge from our analyses. Predicting box office revenue on the basis of data available before the release of the movie remains difficult, even with state-of-the-ar...
Article
This paper employs a Directed Acyclic Graph (DAG) to investigate direct and indirect effects among the proportion of household expenditure spent on alcohol, the proportion of household expenditure on tobacco, the proportion of household expenditure on gambling, and fourteen demographic factors from a socio-economic survey of 43,844 Thai households...
Article
Full-text available
This paper presents a study of household tobacco consumption in Thailand from 2006 to 2011. We investigated the nonlinear relationships between this behavior and household alcohol expenditure, household gambling expenditure, and demographic factors and used TreeNet to analyze datasets drawn from socio-economic surveys. Across all the years included...
Conference Paper
This paper uses SOMbrero visualizations to examine two socioeconomic dimensions of European states, generated by a factor analysis of time-series data from 2001-2013. We analyze SOMs for 41 countries with regard to "Old Capital" and "New Capital", two factors that are generated from 12 variables. SOMbrero reveals evidence of various convergenc...
Article
Full-text available
This article uses self-organizing maps (SOMs) to examine convergence between European states, giving special attention to the states of Central and Eastern Europe (CEE) that joined the European Union (EU) during its monumental expansion in 2004. To augment the literature on income convergence, the robust conceptual framework employed here is based...
Article
Full-text available
Background: In this paper, we investigate how household alcohol consumption in Thailand relates to the age of the head of household. Methods: We use datasets drawn from socio-economic surveys of Thai households conducted during the period of 2006–2011, and we use TreeNet, a data-mining technique, to investigate nonlinear relationships between respon...
Chapter
The term data mining, also known as analytics, data science, or “Big Data”, refers to the identification – within a typically large database – of new, valid, and interesting patterns. While data mining is very popular in the context of, for example, database and web marketing, most of the methods under the data mining umbrella have been widely applied in biost...
Article
The digital divide, while attracting considerable research and political attention since the introduction of the Internet, remains an apparently intractable issue for parts of the developing world. While most prior research focuses on national income as the most significant predictor of the digital divide and thus leaves little room for intervention w...
Chapter
Traditionally, the debate surrounding development has focused mainly on the prevalence and effects of poverty and minimal economic growth (Chen and Ravallion, 2008). Some studies have also attempted to give explanations for this present situation of poverty and low economic growth. In all, a plethora of studies exist covering such issues as aid ver...
Article
Purpose – The purpose of this paper is to propose data mining techniques to model the return on investment from various types of promotional spending to market a drug and then use the model to draw conclusions on how the pharmaceutical industry might go about allocating promotion expenditures in a more efficient manner, potentially reducing costs...
Conference Paper
Full-text available
This paper proposes an approach for comparing interlocked board networks over time to test for statistically significant change. In addition to contributing to the conversation about whether the Mizruchi hypothesis (that a disintegration of power is occurring within the corporate elite) holds or not, we propose novel methods to handle a longitudina...
Chapter
This chapter describes and contrasts two main approaches to visualizing the very large actor co-starring network. Technical details on how to construct the visualizations are provided and memory problems discussed. The chapter demonstrates a successful use of k-core techniques for visualizing large networks.
Chapter
This chapter gives a road map of the topics discussed in the monograph and briefly introduces what is meant by “Movie Analytics”.
Chapter
This chapter briefly introduces the terms Big Data, Analytics and Data Science.
Chapter
In this chapter we examine the role of prediction markets in evaluating the probability of a nominated motion picture receiving an Academy award. We illustrate the issue with the best picture award in 2013.
Chapter
In this chapter, we focus on whether text reviews of movies nominated for a Best Picture award carry any sign of the likelihood of a movie winning the award. We suggest that a measure of how controversial the movie is perceived to be, the value of which could be extracted by a text analysis of the reviews, is a potential pre...
Chapter
This chapter demonstrates how to analyze longitudinal data on weekly attendance over several years in eight movie theaters in France, all located in small to medium-sized cities in the southwest of the country (the movie theaters considered in this study have from 1 to 4 rooms). The necessary R code is included and discussed.
Book
Movies will never be the same after you learn how to analyze movie data, including key data mining, text mining and social network analytics concepts. These techniques may then be used in endless other contexts. In the movie application, this topic opens a lively discussion on the current developments in big data from a data science perspective. Th...
Article
Full-text available
Biological thought increasingly recognizes the centrality of the genome in constituting and regulating processes ranging from cellular systems to ecology and evolution. In this paper, we ask whether genomics is similarly positioned as a core concept in the instructional sequence for undergraduate biology. Using quantitative methods, we analyzed the...
Article
Full-text available
This paper proposes data mining techniques to model the return on investment from various types of promotional spending to market a drug and then uses the model to draw conclusions on how the pharmaceutical industry might go about allocating marketing expenditures in a more efficient manner, potentially reducing costs to the consumer.
Article
Full-text available
We demonstrate on a case study with two competing products at a bank how one can use a Hidden Markov Chain (HMC) to estimate missing information on a competitor's marketing activity. The idea is that given time series with sales volumes for products A and B and marketing expenditures for product A, as well as suitable predictors of sales for produc...
Conference Paper
Full-text available
The Advances in Teaching and Learning Technologies Mini-track has a history at HICSS that spans more than seventeen years. Various incarnations of this mini-track have served as an outlet for researchers who investigate the collaborative aspects of teaching, ...
Conference Paper
Many of the 1.5 billion users of social media send or receive messages or post or access new content every day. The personal, social, political and business implications of this volume of activity are profound. Extracting useful information from this activity presents a challenge for scientists, technologists and also scholars in other disciplines....
Article
Full-text available
This paper introduces Kohonen Self-Organizing Maps (SOMs) to the scholarly discussion of the United Nations (UN)’s Millennium Development Goals (MDGs). We use data through the MDGs’ approximate mid-point (2000-2008) to analyze three world regions: Africa, Asia and Latin America. We observe a handful of countries that showcase noteworthy progress, i...
Chapter
Business analytics, in simple terms, is data analysis applied to business problems. While its origins and history are closely tied to finance and other data-intensive areas of business, in recent years business analytics has moved into many more areas of corporate and social life. Along with this trend has come a closer connection to the traditiona...
Article
Even though social networking sites are very popular all around the globe, social networks for professionals have not received much attention from the scientific community. We study how physicians interact with each other from the perspective of network analysis. In our study, we treat each physician as a node, and the link between them represents...
Conference Paper
Using RFID-based real-time location systems to describe and understand social networks in the outpatient setting. Purpose: All technology change is culture change. In order to better understand and facilitate change in health care we need to better understand the complex social structure in clinical systems. Social network analysis can help us do...
Article
Full-text available
The increasing competition for graduate students among business schools has resulted in a greater emphasis on graduate business student retention. In an effort to address this issue, the current article uses survival analysis, decision trees and TreeNet® to identify factors that can be used to identify students who are at risk of dropping out of a...
Chapter
It is rarely sufficient simply to present estimates of means, coefficients, or poverty rates that have been calculated based on survey data. We also need measures of the variability of these measures, so that we may judge how much confidence to have in them.
Chapter
Economists and statisticians are rediscovering geography. Until relatively recently, most economic models essentially ignored spatial variations in data and in relationships; these were not at the heart of the issues that were considered to be interesting.
Chapter
Most household survey data come from a single cross-section of households surveyed at a single point in time. This is useful if the purpose is to get a snapshot of income or poverty, and it does allow for a detailed analysis – for instance, of the proximate determinants of health or malnutrition or income. However, it is rarely possible to get an a...
Chapter
It is tempting, but wrong, to believe that graphical techniques have little to offer for serious researchers in economics, statistics, or policy analysis. Their true power comes from the ability of the eye to discern patterns in a graph that are not clearly evident from lists of numbers or tabulated statistics. In Tufte’s pithy phrase, “graphics re...
Chapter
Full-text available
In the classical (or frequentist) approach to statistical methods, the analyst uses a sample of data to make inferences about the value of fixed but unknown population parameters. Among other things, this allows one to construct confidence intervals. Suppose, for example, we wish to estimate a proportion p, say a poverty rate, from a simple random...
Chapter
Well over 200 years ago, Adam Smith wrote his classic An Inquiry into the Nature and Causes of the Wealth of Nations. Of course interest in causality goes back much further: Democritus, the pre-Socratic “laughing philosopher,” wrote, “I would rather discover one causal law than be King of Persia.”
Chapter
Household survey data are generated by sampling, and cannot be interpreted successfully unless the sampling has been done correctly.
Chapter
Household surveys can provide a great deal of information about incomes, spending, crops grown, and other household and individual characteristics. This detail comes at a cost: given the expense of surveying each household, the number of households sampled is typically fairly modest, rarely exceeding 10,000. Samples of this size are adequate for es...
Chapter
We are often interested in modeling the time that elapses between one event and another – for instance, between one birth and the next, between a medical treatment and recovery, or between losing a job and finding the next one. Duration models are concerned with describing and explaining these spells.
Chapter
Full-text available
A government sets up a scheme for extending microcredit to farmers; or builds an irrigation canal; or provides free textbooks to 10-year-olds; or introduces supplemental nutrition for pregnant mothers; or strengthens the social security net with a food-for-work program.
Chapter
We are often interested in grouping observations. Whenever we report statistics broken down by expenditure quintile, or by region, or by household size, we are gathering observations into clusters. The purpose is to help make more sense of the data, to create more order out of a potentially chaotic mass of information.
Chapter
Nearly all regression analysis begins by estimating a linear model of the form:
$$ \begin{aligned} y_i &= \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_k x_{ik} + \varepsilon_i \\ &= \mathbf{x}_i' \beta + \varepsilon_i, \end{aligned} \tag{4.1} $$
where \( \beta = (\beta_0, \beta_1, \ldots, \beta_k)' \)...
Chapter
The measurement of poverty and inequality is surprisingly intricate. The purpose of this chapter is to provide a self-contained overview of the issues that arise when trying to measure poverty. The virtue of this chapter is concision; for more extensive treatments, one might start with the Handbook by Haughton and Khandker (2009), or the classic e...
Chapter
This chapter reviews the essentials of regression analysis. For most readers it will be a refresher that can be skimmed quickly; it provides a concise, self-contained coverage of topics that are the staple of any good course on econometrics.
Article
Full-text available
This paper describes and compares three clustering techniques: traditional clustering methods, Kohonen maps and latent class models. The paper also proposes some novel measures of the quality of a clustering. To the best of our knowledge, this is the first contribution in the literature to compare these three techniques in a context where the class...
Article
Full-text available
This talk presents a method for obtaining small-area estimators in the context of the Vietnam Living Standards Surveys. We briefly introduce these surveys, then review the main concepts of small-area estimation, notably the use of auxiliary data, and contrast simple models with those...
Chapter
There has been a considerable growth in interest in network analysis. Air transportation networks are regarded as complex networks which are full of dynamics and complexity. This study focuses on the US air transportation network, which is one of the most diverse and dynamic transportation networks in the world. All of the data are drawn from the U...
Article
Full-text available
This paper proposes to investigate inequality in Viet Nam from the point of view of a study of the urban/rural gap by means of a multilevel model. Using data from the Viet Nam Household Living Standards Survey of 2002, the paper constructs a multilevel model, yielding random effects in the urban/rural gap which can be seen as location-specific rand...
Article
Purpose – The purpose of this paper is to present a new framework to examine the adoption of virtual worlds. Virtual worlds, defined as internet‐based simulated environments that emulate the real world and are intended for users to inhabit and interact within them through avatars, are growing fast and are attracting more and more users. Design/met...
Article
Full-text available
This article reviews three software packages that can be used [...] is a product of Statistical Innovations, whereas MCLUST and poLCA are packages written in R and are available through the web site http://www.r-project.org. We use a single dataset and apply each software package to develop a latent class cluster analysis for the data. This allows us t...
Article
Full-text available
The need to pre-specify expected interactions between variables is an issue in multiple regression. Theoretical and practical considerations make it impossible to pre-specify all possible interactions. The functional form of the dependent variable on the predictors is unknown in many cases. Two ways are described in which the data mining technique...
Conference Paper
A virtual world is an Internet-based simulated environment which users inhabit and where they interact via avatars. As a new information technology innovation, it has developed very fast and has brought about economic success for organizations and individuals. Very limited research has been done to date to investigate virtual worlds. This paper pre...
Article
This case study describes efforts to promote collaborative research across traditional boundaries in a business‐oriented university as part of an institutional transformation. We model this activity within the framework of social network analysis and use quantitative tools from that field to characterize resulting impacts.
Article
Full-text available
Several variations are given for an algorithm that generates random networks approximately respecting the probabilities given by any likelihood function, such as from a p* social network model. A novel use of the genetic algorithm is incorporated in these methods, which improves its applicability to the degenerate distributions that can arise with...
Article
Full-text available
Purpose: To provide insights into the experience of women aspiring to the CEO position, particularly regarding qualifications and compensation expectations. Design/methodology/approach: The ExecuComp database of executives at 1,500 large US corporations from 1992 to 2004 was used to identify women CEOs and to examine gender differences in compensati...
Article
This paper reports the results of research investigating the determinants of the propensity to switch wireless service providers. A model generated from the data rather than from a priori theory is presented, and it is found to uphold the strong relationship between customer satisfaction and customer loyalty exhibited in prior studies. In sharp con...
Article
Full-text available
This study examines the disparities in living standards between and among the different ethnic groups in Vietnam. Using data from the Vietnam Living Standards Surveys and 1999 Census, we show that 'majority' Kinh and Hoa households have substantially higher living standards than 'minority' households from Vietnam's 52 other ethnic groups. While the...
Article
Full-text available
With the help of a Kohonen self-organising algorithm, this paper presents a mapping and analysis of the global digital divide along with its main drivers. Several broad groups and subgroups are identified, consisting of countries that are similar in their digital development and in a number of other attributes. We find that the digital divide seems...
Article
Full-text available
The deepening of the digital divide between countries has prompted international organizations and governments to work together toward reducing the problem over the next 15 years. However, such efforts will likely succeed only if they are based on a firm grasp of the divide’s underlying causes. In this paper we report the results of a comprehensive...
Article
Full-text available
This article offers a review of three software packages that estimate directed acyclic graphs (DAGs) from data. The three packages, MIM, Tetrad and WinMine, can help researchers discover underlying causal structure. Although each package uses a different algorithm, the results are to some extent similar. All three packages are free and easy to use....
Article
Full-text available
This paper employs Kohonen self-organizing maps (SOMs) to examine economic and social convergence among Eurasian countries. The SOMs simplify a large and complex set of twenty-eight socio-economic measures for Eurasian states in order to identify clusters of states and examine the extent to which these clusters have converged over time....
Article
Full-text available
The purpose of this article is to review two text mining packages, namely, WordStat and SAS TextMiner. WordStat is developed by Provalis Research. SAS TextMiner is a product of SAS. We review the features offered by each package on each of the following key steps in analyzing unstructured data: (1) data preparation, including importing and cleaning...