Renato Cordeiro Ferreira
Tilburg University | UVT · Jheronimus Academy of Data Science (JADS)
PhD Candidate in Computer Science
Scientific Programmer @ JADS | PhD Candidate in Computer Science @ USP | Co-founder & Coordinator @ CodeLab
About
28 Publications
998 Reads
4 Citations
Introduction
Scientific Programmer at the Jheronimus Academy of Data Science (NL), working on the MARIT-D European project. PhD candidate at the University of São Paulo (BR), researching MLOps and Intelligent Software Engineering. Former Principal Machine Learning Engineer at Elo7, delivering AI solutions into production. Co-founder of CodeLab, an extracurricular group whose goal is to help tech students become professional software engineers. In leisure time, a D&D GM with more than 100 sessions (over 300 hours) of play.
Education
December 2020 - June 2026
February 2016 - June 2020
February 2012 - January 2016
Publications (28)
Applying agile practices in data science requires adaptations. This paper describes challenges and lessons learned in two applied machine learning projects developed in the XP Lab course at University of São Paulo in Brazil. It compiles six suggestions for educators and practitioners who want to bring agility to their data science initiatives.
Respiratory insufficiency is a medical symptom in which a person gets a reduced amount of oxygen in the blood. This paper reports the experience of building SPIRA: an intelligent system for detecting respiratory insufficiency from voice. It compiles challenges faced in two succeeding implementations of the same architecture, summarizing lessons learn...
The A.D.A. – Advanced Distributed Assistant – project aims to build a smart distributed personal assistant, that is, a virtual agent that can interact with the user through an ecosystem of devices, such as IoT (Internet of Things), by voice commands in Portuguese. The project is divided into six scientific initiation subprojects from different areas...
The Hierarchical, Interactive and Dynamic Recognition Architecture (H.I.D.R.A.) for Product Categorization is a new intelligent system architecture developed by Elo7 to easily evolve its category tree and automatically classify millions of products, thus improving the page ranking of our marketplace.
The expansion of Data Science projects in organizations has been led by three factors: the growth in the amount of data generated, the evolution in storage capacity, and the increase in computational capabilities. However, most of these projects fail to deliver the expected value: 82% of the teams do not use any process model. Despite the popularit...
The expansion of Data Science projects in organizations has been led by three factors: the growth in the amount of data generated, the evolution in storage capacity, and the increase in computational capabilities. However, most of these projects fail to deliver the expected value: 82% of the teams do not use any process model. Despite the popularit...
This paper presents the journey of CodeLab: a student-organized initiative from the University of São Paulo that has grown thanks to university hackathons. It summarizes patterns, challenges, and lessons learned over 15 competitions organized by the group from 2015 to 2020. Unfortunately, the COVID-19 pandemic affected the group's institutional kno...
The Software Crisis has reached AI: according to Gartner’s report in 2021, only around 54% of AI products successfully reach production. Since the early days of software engineering, the rise of complexity has been known to play a key role in projects failing. The goal of this research is to investigate how complexity affects Machine Learning (ML)....
The Software Crisis has reached AI: according to Gartner’s report in 2021, only around 53% of AI products successfully reach production. Since the early days of software engineering, the rise of complexity has been known to play a key role in projects failing. The goal of this research is to investigate how complexity affects Machine Learning (ML)....
“Talk is cheap. Show me the code”. Using real code examples is a way of engaging students while teaching Software Engineering. By applying this technique, this paper describes the experience of introducing good development practices in the course “Programming Techniques II”, offered for students of the Bachelor in Computer Science of the Institute...
Presentation about the paper "SPIRA: Building an Intelligent System for Respiratory Insufficiency Detection"
Presentation about MLOps for Nubank's Data Science & Machine Learning Meetup. Video of the presentation at: https://renatocf.xyz/mlops-live
Presentation about the paper "Being Agile in a Data Science Project"
University Hackathons to Build a Community of Engaged Students. This paper presents the history and evolution of CodeLab, an extension project that has been growing since 2015 through the use of university hackathons. It shows a detailed chronological description of all the competitions organized, contextualizing their goals and lessons learned. It...
Multi-label classification becomes costly in terms of processing as the number of categories increases. This directly impacts page ranking and product discovery, affecting the revenue of an e-commerce company. This experience report presents a reactive microservices architecture capable of classifying products in up to 10,000 categories in near real...
Presentation about the paper "Toward the development of A.D.A. - Advanced Distributed Assistant"
Presentation about the paper "H.I.D.R.A. - A Hierarchical, Interactive and Dynamic Recognition Architecture for Product Categorization"
The rise of technological dependency has made some requirements crucial to online systems, such as availability and scalability. The microservices architectural style provides improvements to scalability and software maintainability and has been broadly adopted. However, microservices highlight trade-offs between consistency and coupling level. This...
Developing complex systems using microservices is a current challenge. In this paper we present our experience with teaching this subject to more than 80 students at the University of São Paulo (USP), fostering teamwork and simulating the industry's environment. We show it is possible to teach such advanced concepts to senior undergraduate stude...
Developing complex systems using microservices is a current challenge. In this paper we present our experience with teaching this subject to more than 80 students at the University of São Paulo (USP), fostering teamwork and simulating the industry’s environment. We show it is possible to teach such advanced concepts to senior undergraduate stude...
Probabilistic Graphical Models (PGMs) are a class of machine learning models used for sequence labeling and alignment. They are widely applied in many research fields, such as natural language processing, speech recognition, computer vision and bioinformatics. Firstly, this project provides a review of PGMs. It summarizes the relationship between 1...
Clients that depend on a class with multiple behaviors are unnecessarily tied to the behaviors they do not use. One possible solution is to split the class into many, but this is not a good strategy if different behaviors share common code. This paper introduces the Secretary pattern, which decreases the client’s coupling while keeping maximum reusabil...
ToPS (Toolkit for Probabilistic Models of Sequences) is a C++ framework used for training and inference of probabilistic models that describe finite sequences of symbols. It has a series of applications and a specification language, which allows configuring models without prior programming knowledge. In this project, we propose to refactor the fram...
Introduction: ToPS (Toolkit for Probabilistic Models of Sequences) is a framework that contains 8 published implementations of probabilistic models [4] and others under development. It is used as the basis of MYOP [3], a system for building gene predictors used in bioinformatics experiments. In this work, we aim to refactor [1] ToPS, so as...
Finding genes is an essential task for modern molecular biology research. This process, called annotation, tries to find the DNA segments that act directly and indirectly in the production of proteins, molecules essential for the development of complex organisms [2]. One of the most important tools for identifying new...
Finding genes is an essential task for modern molecular biology research. This process, called annotation, tries to find the DNA segments that act directly and indirectly in the production of proteins, molecules essential for the development of complex organisms. One of the most important tools for identifying new gen...
Finding genes is an essential task that guides modern molecular biology. Genes are used in the synthesis of proteins, molecules essential for the development of complex organisms [2]. An important tool for identifying genes is the ab initio gene predictor, which uses statistical models to automate the process of sear...
Finding genes is an essential task that guides modern molecular biology. Genes are used in the synthesis of proteins, molecules essential for the development of complex organisms [2]. An important tool for identifying genes is the ab initio gene predictor, which uses statistical models to automate the process of sear...