Archived project

Automated Exploration of Data and Models

Goal: Apply AI methods to make the exploration of data and models easier, faster and more efficient. The project will focus strongly on visualization and high-dimensional data.


Project log

Mateusz Staniak
added a research item
The increasing availability of large but noisy data sets with many heterogeneous variables has led to increasing interest in automating common data analysis tasks. The most time-consuming part of this process is Exploratory Data Analysis, which is crucial for better domain understanding, data cleaning, data validation, and feature engineering. A growing number of libraries attempt to automate some of the typical Exploratory Data Analysis tasks to make the search for new insights easier and faster. In this paper, we present a systematic review of existing tools for Automated Exploratory Data Analysis (autoEDA). We explore the features of fifteen popular R packages to identify the parts of the analysis that can be effectively automated with current tools and to point out new directions for further autoEDA development.
Mateusz Staniak
added an update
The review is now on arXiv: https://arxiv.org/abs/1904.02101
 
Mateusz Staniak
added an update
I have finished a preprint of a review of R packages (and some tools from other languages) dedicated to EDA automation.
 
Mateusz Staniak
added an update
I am maintaining a growing list of software and papers related to various aspects of automated exploration of data (autoEDA) or models at
 
Mateusz Staniak
added a research item
The increasing availability of large but noisy data sets with many heterogeneous variables has led to increasing interest in automating common data analysis tasks. The most time-consuming part of this process is Exploratory Data Analysis, which is crucial for better domain understanding, data cleaning, data validation, and feature engineering. A growing number of libraries attempt to automate some of the typical Exploratory Data Analysis tasks to make the search for new insights easier and faster. In this paper, we present a systematic review of existing tools for Automated Exploratory Data Analysis (autoEDA). We explore the features of twelve popular R packages to identify the parts of the analysis that can be effectively automated with current tools and to point out new directions for further autoEDA development.
Mateusz Staniak
added a project goal
Apply AI methods to make the exploration of data and models easier, faster and more efficient. The project will focus strongly on visualization and high-dimensional data.