George John

Khosla Ventures

About

30 Publications
23,236 Reads
16,118 Citations

Publications (30)
Conference Paper
In 2008, Rocket Fuel's founders saw a gap in the digital advertising market. None of the existing players was building autonomous systems based on big data and artificial intelligence; instead, they offered fairly simple technology and relied on human campaign managers to drive success. Five years later, in 2013, Rocket Fuel had the best...
Article
When modeling a probability distribution with a Bayesian network, we are faced with the problem of how to handle continuous variables. Most previous work has either solved the problem by discretizing, or assumed that the data are generated by a single Gaussian. In this paper we abandon the normality assumption and instead use statistical methods fo...
Article
Data mining is an umbrella term referring to the process of discovering patterns in data, typically with the aid of powerful algorithms to automate part of the search. These methods come from disciplines such as statistics, machine learning (artificial intelligence), pattern recognition, neural networks, and databases. Two data analysts with dif...
Article
Machine learning algorithms for supervised learning are in wide use. An important issue in the use of these algorithms is how to set the parameters of the algorithm. While the default parameter values may be appropriate for a wide variety of tasks, they are not necessarily optimal for a given task. In this paper, we investigate the use of cross-val...
Article
In delayed reinforcement learning, an agent is concerned with the problem of discovering an optimal policy, a function mapping states to actions. The most popular delayed reinforcement learning technique, Q-learning, has been proven to produce an optimal policy under certain conditions. However, often the agent does not follow the optimal policy fa...
Article
Successful technology becomes invisible. Few people think much about internal combustion engines while they drive to work in three-thousand-pound hunks of metal powered by them, or electricity while it enables countless parts of their modern lives. Data mining has a long way to go before it succeeds in this way -- or does it? At KDD-98, the Behind-...
Article
Full-text available
In the feature subset selection problem, a learning algorithm is faced with the problem of selecting a relevant subset of features upon which to focus its attention, while ignoring the rest. To achieve the best possible performance with a particular learning algorithm on a particular training set, a feature subset selection method should consider...
Chapter
Full-text available
In the feature subset selection problem, a learning algorithm is faced with the problem of selecting a relevant subset of features upon which to focus its attention, while ignoring the rest. To achieve the best possible performance with a particular learning algorithm on a particular training set, a feature subset selection method should consider h...
Article
Full-text available
We address the problem of finding a subset of features that allows a supervised induction algorithm to induce small high-accuracy concepts. We examine notions of relevance and irrelevance, and show that the definitions used in the machine learning literature do not adequately partition the features into useful categories of relevance. We present de...
Article
We approach stock selection for long/short portfolios from the perspective of knowledge discovery in databases and rule induction: given a database of historical information on some universe of stocks, discover rules from the data that will allow one to predict which stocks are likely to have exceptionally high or low returns in the future. Long/sh...
Article
We approach the problem of stock selection from the perspective of knowledge discovery in databases: given a database of several years of quarterly information on over a thousand companies, discover patterns in the data that will allow one to predict which stocks are likely to have exceptional returns in the future. The database includes measures o...
Article
As data warehouses grow to the point where one hundred gigabytes is considered small, the computational efficiency of data-mining algorithms on large databases becomes increasingly important. Using a sample from the database can speed up the data-mining process, but this is only acceptable if it does not reduce the quality of the mined knowledge. To...
Article
In the feature subset selection problem, a learning algorithm is faced with the problem of selecting a relevant subset of features upon which to focus its attention, while ignoring the rest. To achieve the best possible performance with a particular learning algorithm on a particular domain, a feature subset selection method should consider how the...
Article
Finding and removing outliers is an important problem in data mining. Errors in large databases can be extremely common, so an important property of a data mining algorithm is robustness with respect to errors in the database. Most sophisticated methods in machine learning address this problem to some extent, but not fully, and can be improved by a...
Article
We present a new method for the induction of tree-structured recursive partitioning classifiers that use a neural network as the partitioning function at each node in the tree. Our technique is appropriate for pattern recognition tasks with many continuous inputs and a single multivalued nominal output. This paper presents two main contributions: 1...
Article
We address the problem of finding the parameter settings that will result in optimal performance of a given learning algorithm using a particular dataset as training data. We describe a “wrapper” method, considering determination of the best parameters as a discrete function optimization problem. The method uses best-first search and crossvalidatio...
Article
We present CHILS, the Convex Hull Inductive Learning System, a novel supervised learning algorithm based on approximating concepts with sets of convex hulls. We introduce a theoretical methodology for describing the power of a concept representation language and use it to compare convex hulls with other geometrical concept representations.
Article
When mining large databases, the data extraction problem and the interface between the database and data mining algorithm become important issues. Rather than giving a mining algorithm full access to a database (by extracting to a flat file or other directly accessible data structure), we propose the SQL Interface Protocol (SIP), which is a framewor...
Article
We present a new method for the induction of classification trees with linear discriminants as the partitioning function at each internal node. This paper presents two main contributions: first, a novel objective function called soft entropy which is used to identify optimal coefficients for the linear discriminants, and second, a novel method for...
Article
Full-text available
In the feature subset selection problem, a learning algorithm is faced with the problem of selecting a relevant subset of features upon which to focus its attention, while ignoring the rest. To achieve the best possible performance with a particular learning algorithm on a particular training set, a feature subset selection method should consider h...
Article
Full-text available
We present MLC++, a library of C++ classes and tools for supervised Machine Learning. While MLC++ provides general learning algorithms that can be used by end users, the main objective is to provide researchers and experts with a wide variety of tools that can accelerate algorithm development, increase software reliability, provide comparison too...
Article
Full-text available
In the feature subset selection problem, a learning algorithm is faced with the problem of selecting a relevant subset of features upon which to focus its attention, while ignoring the rest. To achieve the best possible performance with a particular learning algorithm on a particular training set, a feature subset selection method should consider...
Article
Thesis (Ph. D.)--Stanford University, 1997. Includes bibliographical references (leaves 179-194). Photocopy.
Article
High-quality financial databases have existed for many years, but human analysts can only scratch the surface of the wealth of knowledge buried in this data. Using the rule-induction technology in the Recon data-mining system, an investment strategy based purely on the learned rules can generate significant profits.
Conference Paper
Discusses the weight update rule in the cascade correlation neural net learning algorithm. The weight update rule implements gradient descent optimization of the correlation between a new hidden unit's output and the previous network's error. The author presents a derivation of the gradient of the correlation function and shows that his resulting w...
Conference Paper
Full-text available
We address the problem of finding the parameter settings that will result in optimal performance of a given learning algorithm using a particular dataset as training data. We describe a "wrapper" method, considering determination of the best parameters as a discrete function optimization problem. The method uses best-first search and cross-validation to wra...
Article
Full-text available
In the feature subset selection problem, a learning algorithm is faced with the problem of selecting a relevant subset of features upon which to focus its attention, while ignoring the rest. To achieve the best possible performance with a particular learning algorithm on a particular training set, a feature subset selection method should consider...
Conference Paper
Full-text available
We present MLC++, a library of C++ classes and tools for supervised machine learning. While MLC++ provides general learning algorithms that can be used by end users, the main objective is to provide researchers and experts with a wide variety of tools that can accelerate algorithm development, increase software reliability, provide comparison tools...
Conference Paper
We present a new method for top-down induction of decision trees (TDIDT) with multivariate binary splits at the nodes. The primary contribution of this work is a new splitting criterion called soft entropy, which is continuous and differentiable with respect to the parameters of the splitting function. Using simple gradient descent to find mult...
