Cláudio T Silva

New York University | NYU · Department of Computer Science and Engineering

PhD

305
Publications
125,469
Reads
11,527
Citations
Additional affiliations
June 2012 - present
July 2011 - present
New York University
Position
  • Professor (Full)
July 2003 - June 2011
University of Utah
Position
  • Professor (Full)

Publications (305)
Article
Full-text available
Urbanization has amplified the importance of three‐dimensional structures in urban environments for a wide range of phenomena that are of significant interest to diverse stakeholders. With the growing availability of 3D urban data, numerous studies have focused on developing visual analysis techniques tailored to the unique characteristics of urban...
Preprint
Full-text available
Machine learning and deep learning models are pivotal in educational contexts, particularly in predicting student success. Despite their widespread application, a significant gap persists in comprehending the factors influencing these models' predictions, especially in explainability within education. This work addresses this gap by employing nine...
Preprint
Full-text available
The development of machine learning applications has increased significantly in recent years, motivated by the remarkable ability of learning-powered systems to discover and generalize intricate patterns hidden in massive datasets. Modern learning models, while powerful, often have a level of complexity that renders them opaque black boxes, resulti...
Preprint
We introduce a novel calibration and reconstruction procedure for structured light scanning that foregoes explicit point triangulation in favor of a data-driven lookup procedure. The key idea is to sweep a calibration checkerboard over the entire scanning volume with a linear stage and acquire a dense stack of images to build a per-pixel lookup tab...
Article
Full-text available
Background: In recent years, researchers have made significant strides in understanding the heterogeneity of breast cancer and its various subtypes. However, the wealth of genomic and proteomic data available today necessitates efficient frameworks, instruments, and computational tools for meaningful analysis. Despite its success as a prognostic too...
Preprint
[ IEEE VR 2024 Paper ] [ https://arxiv.org/abs/2402.00344 ] Current visualization research has identified the potential of more immersive settings for data exploration, leveraging VR and AR technologies. To explore how a traditional visualization system could be adapted into an immersive framework, and how it could benefit from this, we decided t...
Article
Full-text available
Many areas of the world are without basic information on the socioeconomic well-being of the residing population due to limitations in existing data collection methods. Overhead images obtained remotely, such as from satellite or aircraft, can help serve as windows into the state of life on the ground and help “fill in the gaps” where community inf...
Article
The concept of augmented reality (AR) assistants has captured the human imagination for decades, becoming a staple of modern science fiction. To pursue this goal, it is necessary to develop artificial intelligence (AI)-based methods that simultaneously perceive the 3D environment, reason about physical tasks, and model the performer, all in real-ti...
Preprint
The concept of augmented reality (AR) assistants has captured the human imagination for decades, becoming a staple of modern science fiction. To pursue this goal, it is necessary to develop artificial intelligence (AI)-based methods that simultaneously perceive the 3D environment, reason about physical tasks, and model the performer, all in real-ti...
Article
Full-text available
Access to high-quality data is an important barrier in the digital analysis of urban settings, including applications within computer vision and urban design. Diverse forms of data collected from sensors in areas of high activity in the urban environment, particularly at street intersections, are valuable resources for researchers interpreting the...
Article
Full-text available
Breast cancer is the second most common cancer type and is the leading cause of cancer-related deaths worldwide. Since it is a heterogeneous disease, subtyping breast cancer plays an important role in performing a specific treatment. In this work, we propose an evaluation framework that uses different machine learning techniques for a broader analy...
Article
Understanding the interpretation of machine learning (ML) models has been of paramount importance when making decisions with societal impacts, such as transport control, financial activities, and medical diagnosis. While local explanation techniques are popular methods to interpret ML models on a single instance, they do not scale to the understand...
Article
While esports organizations are increasingly adopting practices of conventional sports teams, such as dedicated analysts and data-driven decision-making, video-based game review is still the primary mode of game analysis. In conventional sports, advances in data collection have introduced systems that allow for sketch-based querying of game situati...
Preprint
Massively multiplayer online role-playing games often contain sophisticated in-game economies. Many important real-world economic phenomena, such as inflation, economic growth, and business cycles, are also present in these virtual economies. One major difference between real-world and virtual economies is the ease and frequency by which a policyma...
Article
Analyzing classification model performance is a crucial task for machine learning practitioners. While practitioners often use count-based metrics derived from confusion matrices, like accuracy, many applications, such as weather prediction, sports betting, or patient risk prediction, rely on a classifier's predicted probabilities rather than predi...
Preprint
Sports, due to their global reach and impact-rich prediction tasks, are an exciting domain to deploy machine learning models. However, data from conventional sports is often unsuitable for research use due to its size, veracity, and accessibility. To address these issues, we turn to esports, a growing domain that encompasses video games played in a...
Article
Full-text available
Noise is one of the primary quality‐of‐life issues in urban environments. In addition to annoyance, noise negatively impacts public health and educational performance. While low‐cost sensors can be deployed to monitor ambient noise levels at high temporal resolutions, the amount of data they produce and the complexity of these data pose significant...
Preprint
Predicting outcomes in sports is important for teams, leagues, bettors, media, and fans. Given the growing amount of player tracking data, sports analytics models are increasingly utilizing spatially-derived features built upon player tracking data. However, player-specific information, such as location, cannot readily be included as features thems...
Preprint
Analyzing classification model performance is a crucial task for machine learning practitioners. While practitioners often use count-based metrics derived from confusion matrices, like accuracy, many applications, such as weather prediction, sports betting, or patient risk prediction, rely on a classifier's predicted probabilities rather than predi...
Preprint
Full-text available
There is a lack of data on the location, condition, and accessibility of sidewalks across the world, which not only impacts where and how people travel but also fundamentally limits interactive mapping tools and urban analytics. In this paper, we describe initial work in semi-automatically building a sidewalk network topology from satellite imagery...
Preprint
Full-text available
Noise is one of the primary quality-of-life issues in urban environments. In addition to annoyance, noise negatively impacts public health and educational performance. While low-cost sensors can be deployed to monitor ambient noise levels at high temporal resolutions, the amount of data they produce and the complexity of these data pose significant...
Article
Full-text available
Sensor networks have dynamically expanded our ability to monitor and study the world. Their presence and need keep increasing, and new hardware configurations expand the range of physical stimuli that can be accurately recorded. Sensors also no longer simply record the data; they process it and transform it into something useful before uploadin...
Preprint
Full-text available
Many audio applications, such as environmental sound analysis, are increasingly using general-purpose audio representations for transfer learning. The robustness of such representations has been determined by evaluating them across a variety of domains and applications. However, it is unclear how the application-specific evaluation can be utilized...
Article
Media streaming, with an edge-cloud setting, has been adopted for a variety of applications such as entertainment, visualization, and design. Unlike video/audio streaming where the content is usually consumed passively, virtual reality applications require 3D assets stored on the edge to facilitate frequent edge-side interactions such as object man...
Preprint
Full-text available
Media streaming has been adopted for a variety of applications such as entertainment, visualization, and design. Unlike video/audio streaming where the content is usually consumed sequentially, 3D applications such as gaming require streaming 3D assets to facilitate client-side interactions such as object manipulation and viewpoint movement. Compar...
Article
Full-text available
While the design of sustainable and resilient urban built environments is increasingly promoted around the world, significant data gaps have made research on pressing sustainability issues challenging to carry out. Pavements are known to have strong economic and environmental impacts; however, most cities lack a spatial catalog of their surfaces due to...
Preprint
Local explainability methods -- those which seek to generate an explanation for each prediction -- are becoming increasingly prevalent due to the need for practitioners to rationalize their model outputs. However, comparing local explainability methods is difficult since they each generate outputs in various scales and dimensions. Furthermore, due...
Preprint
Full-text available
While the design of sustainable and resilient urban built environments is increasingly promoted around the world, significant data gaps have made research on pressing sustainability issues challenging to carry out. Pavements are known to have strong economic and environmental impacts; however, most cities lack a spatial catalog of their surfaces due to...
Conference Paper
Full-text available
Graffiti is an inseparable element of most large cities. It is of critical value to recognize whether it is an artistic product or a sign of distortion. This study develops a larger graffiti dataset containing a variety of graffiti types and annotated bounding boxes. We use this data to obtain a robust graffiti detection model. Compared with existing...
Preprint
Full-text available
Exploring large virtual environments, such as cities, is a central task in several domains, such as gaming and urban planning. VR systems can greatly help this task by providing an immersive experience; however, a common issue with viewing and navigating a city in the traditional sense is that users can either obtain a local or a global view, but n...
Preprint
Full-text available
The growth of cities calls for regulations on how urban space is used, and zoning resolutions define how and for what purpose each piece of land will be used. Tracking land use and zoning evolution can reveal a wealth of information about urban development. To that end, cities have been releasing data sets describing the historical evolut...
Preprint
Full-text available
Large-scale analysis of pedestrian infrastructures, particularly sidewalks, is critical to human-centric urban planning and design. Benefiting from the rich data set of planimetric features and high-resolution orthoimages provided through the New York City Open Data portal, we train a computer vision model to detect sidewalks, roads, and buildings...
Conference Paper
Full-text available
Large-scale analysis of pedestrian infrastructures, particularly sidewalks, is critical to human-centric urban planning and design. Benefiting from the rich data set of planimetric features and high-resolution orthoimages provided through the New York City Open Data portal, we train a computer vision model to detect sidewalks, roads, and buildings...
Article
Breast cancer is the second most common cancer type and is the leading cause of cancer-related deaths worldwide. Since it is a heterogeneous disease, subtyping breast cancer plays an important role in performing a specific treatment. Gene expression data is a viable alternative to be employed on cancer subtype classification, as they represent the...
Preprint
We introduce AlphaD3M, an automatic machine learning (AutoML) system based on meta reinforcement learning using sequence models with self play. AlphaD3M is based on edit operations performed over machine learning pipeline primitives providing explainability. We compare AlphaD3M with state-of-the-art AutoML systems: Autosklearn, Autostacker, and TPO...
Article
Full-text available
Background: The trapezius muscle is often utilized as a muscle or nerve donor for repairing shoulder function in those with brachial plexus birth palsy (BPBP). To evaluate the native role of the trapezius in the affected limb, we demonstrate use of the Motion Browser, a novel visual analytics system to assess an adolescent with BPBP. Method: An 18-ye...
Preprint
Full-text available
Visual place recognition (VPR) is critical in not only localization and mapping for autonomous driving vehicles, but also assistive navigation for the visually impaired population. To enable a long-term VPR system on a large scale, several challenges need to be addressed. First, different applications could require different image view directions,...
Preprint
The goal of automatic video summarization is to create a short skim of the original long video while preserving the major content and events. There is growing interest in integrating user queries into video summarization, known as query-driven video summarization. This video summarization method predicts a concise synopsis of the original video...
Conference Paper
Visual place recognition (VPR) is critical in not only localization and mapping for autonomous driving vehicles, but also assistive navigation for the visually impaired population. To enable a long-term VPR system on a large scale, several challenges need to be addressed. First, different applications could require different image view directions,...
Preprint
Full-text available
The outputs of win probability models are often used to evaluate player actions. However, in some sports, such as the popular esport Counter-Strike, there exist important team-level decisions. For example, at the beginning of each round in a Counter-Strike game, teams decide how much of their in-game dollars to spend on equipment. Because the dolla...
Article
Extracting and analyzing crime patterns in big cities is a challenging spatiotemporal problem. The problem's hardness is linked to the sparse nature of the crime activity and its spread in large spatial areas. Sparseness hampers most time series comparison methods from working properly, while handling large areas tends to render the computational c...
Preprint
Video summarization aims to simplify large-scale video browsing by generating concise, short summaries that differ from but represent the original video well. Due to the scarcity of video annotations, recent progress in video summarization concentrates on unsupervised methods, among which GAN-based methods are most prevalent. This type of metho...
Article
Urban art constitutes an important issue in urbanism. Previous studies on the spatial distribution of graffiti rarely consider visual categories and how the city topology can impact graffiti production. In this work, after assigning graffiti occurrences to three categories, we analyzed their spatial distribution while searching for possible biases....
Article
Full-text available
Exploring large virtual environments, such as cities, is a central task in several domains, such as gaming and urban planning. VR systems can greatly help this task by providing an immersive experience; however, a common issue with viewing and navigating a city in the traditional sense is that users can either obtain a local or a global view, but n...
Preprint
Full-text available
Game review is crucial for teams, players and media staff in sports. Despite its importance, game review is work-intensive and hard to scale. Recent advances in sports data collection have introduced systems that couple video with clustering techniques to allow for users to query sports situations of interest through sketching. However, due to data...
Article
Full-text available
A large number of stroke survivors suffer from a significant decrease in upper extremity (UE) function, requiring rehabilitation therapy to boost recovery of UE motion. Assessing the efficacy of treatment strategies is a challenging problem in this context, and is typically accomplished by observing the performance of patients during their executio...
Preprint
Many esports use a pick and ban process to define the parameters of a match before it starts. In Counter-Strike: Global Offensive (CSGO) matches, two teams first pick and ban maps, or virtual worlds, to play. Teams typically ban and pick maps based on a variety of factors, such as banning maps which they do not practice, or choosing maps based on t...
Article
Full-text available
Interactive visualizations are at the core of the exploratory data analysis process, enabling users to directly manipulate and gain insights from data. In this article, we present three different ways in which interactive visualizations can be included in Jupyter Notebooks: 1) matplotlib callbacks; 2) visualization toolkits; and 3) embedding HTML v...
Preprint
Esports, despite its expanding interest, lacks fundamental sports analytics resources such as accessible data or proven and reproducible analytical frameworks. Even Counter-Strike: Global Offensive (CSGO), the second most popular esport, suffers from these problems. Thus, quantitative evaluation of CSGO players, a task important to teams, media, be...
Article
Multidimensional Projection is a fundamental tool for high-dimensional data analytics and visualization. With very few exceptions, projection techniques are designed to map data from a high-dimensional space to a visual space so as to preserve some dissimilarity (similarity) measure, such as the Euclidean distance. In fact, although ado...
Preprint
Full-text available
Despite the great differences among cities, they face similar challenges regarding social inequality, politics and criminality. Urban art expresses these feelings from the citizens' point of view. In particular, the drawing and painting of public surfaces may carry rich information about the time and region in which it was made. Existing studies have explored t...
Article
We present an atmospheric model tailored for the interactive visualization of planetary surfaces. As the exploration of the solar system is progressing with increasingly accurate missions and instruments, the faithful visualization of planetary environments is gaining increasing interest in space research, mission planning, and science communicatio...
Article
In recent years, a wide variety of automated machine learning (AutoML) methods have been proposed to generate end-to-end ML pipelines. While these techniques facilitate the creation of models, given their black-box nature, the complexity of the underlying algorithms, and the large number of pipelines they derive, they are difficult for developers t...
Preprint
We present an atmospheric model tailored for the interactive visualization of planetary surfaces. As the exploration of the solar system is progressing with increasingly accurate missions and instruments, the faithful visualization of planetary environments is gaining increasing interest in space research, mission planning, and science communicatio...
Preprint
Multidimensional Projection is a fundamental tool for high-dimensional data analytics and visualization. With very few exceptions, projection techniques are designed to map data from a high-dimensional space to a visual space so as to preserve some dissimilarity (similarity) measure, such as the Euclidean distance. In fact, although ado...
Preprint
Full-text available
Urban planning is increasingly data driven, yet the challenge of designing with data at a city scale and remaining sensitive to the impact at a human scale is as important today as it was for Jane Jacobs. We address this challenge with Urban Mosaic, a tool for exploring the urban fabric through a spatially and temporally dense data set of 7.7 millio...
Preprint
With the increasing sophistication of machine learning models, there is a growing trend of developing model explanation techniques that focus on only one instance (local explanation) to ensure faithfulness to the original model. While these techniques provide accurate model interpretability on various data primitives (e.g., tabular, image, or text),...
Preprint
Understanding the interpretation of machine learning (ML) models has been of paramount importance when making decisions with societal impacts such as transport control, financial activities, and medical diagnosis. While current model interpretation methodologies focus on using locally linear functions to approximate the models or creating self-expl...
Preprint
In data science, there is a long history of using synthetic data for method development, feature selection and feature engineering. Our current interest in synthetic data comes from recent work in explainability. Today's datasets are typically larger and more complex - requiring less interpretable models. In the setting of post hoc explain...
Article
An understanding of person dynamics is indispensable for numerous urban applications, including the design of transportation networks and planning for business development. Pedestrian counting often requires utilizing manual or technical means to count individuals in each location of interest. However, such methods do not scale to the size of a cit...
Preprint
Full-text available
Predicting commuting flows based on infrastructure and land-use information is critical for urban planning and public policy development. However, it is a challenging task given the complex patterns of commuting flows. Conventional models, such as the gravity model, are mainly derived from physics principles and limited by their predictive power in rea...