About
10 Publications
8,762 Reads
7 Citations
Introduction
Fake Review Detection, Tuning Recommender Systems
Publications (10)
Along with the ever-increasing portfolio of products online, the incentive for market participants to write fake reviews to gain a competitive edge has increased as well. This article demonstrates the effectiveness of detecting fake reviews with different combinations of spam detection features that go beyond the review-based features typically used....
The development of the DASC-PM (Data Science Process Model) is based on the knowledge of a large working group consisting of experts in Data Science. We believe that the procedure described in the DASC-PM can be used beneficially in data-driven projects and offers support in structuring complex projects. In version 1.1 of the DASC-PM, which was pu...
The development of the DASC-PM (Data Science Process Model) is based on the knowledge of a large working group made up of experts in Data Science. We believe that the procedure described in the DASC-PM can be used beneficially in data-driven projects and offers support in the...
Abstract
Data science projects are typically interdisciplinary, address a wide range of problems from different domains, and are often characterized by heterogeneous project features. Efforts toward a uniform characterization of data science projects are particularly relevant when it comes to their...
In February 2020, the first version of a comprehensive process model for data science projects appeared: the Data Science Process Model (DASC-PM). The positive feedback we have received indicates we were able to contribute to the discussion of data science activities that we were hoping for. Over the last two years, the DASC-PM has found its way in...
Data science has arrived in companies. Because of its many possible applications, for example in decisions, forecasts, and the further development or redesign of products and services, there are several approaches to such projects.
----
Daurer, Stephan; Theuerkauf, René; Franke, Tony (2022): Vorgehensmodelle bei Data...
In February 2020, the Data Science Process Model (DASC-PM) appeared as the first version of a comprehensive process model for data science projects. The many positive responses we have received show us that we were able to make the contribution we had hoped for to the discussion around data science activities. Over the last...
Between April 2019 and February 2020, a virtual working group of researchers and practitioners produced the data science process model DASC-PM, whose goal is to structure existing knowledge about carrying out data science projects in a suitable form for all stakeholder groups. Taking into account...
For users, keeping track of the film industry's ever-growing range of movies and TV series is increasingly difficult and exacerbates the paradox of choice: that too many choices hinder decision-making. As a countermeasure, recommender systems can personalize offers and limit the variety of media to what users find relevant. In this paper, we presen...
Questions
Question (1)
Hello everyone,
I am currently planning a research project to identify fake reviews on e-commerce platforms.
Ideally, I would like a labeled Amazon customer review dataset, like this one (https://jmcauley.ucsd.edu/data/amazon/) but extended. However, finding a labeled dataset is proving difficult.
By chance, I came across this one (https://github.com/lievcin/amazon_deception). According to the description, this dataset is used in a Data Science master's program for NLP tasks. In the assignment (https://github.com/lievcin/amazon_deception/blob/master/NLP%20Assignment%201%20(Full).pdf), the dataset is presented as labeled by Amazon. Unfortunately, I find this very hard to believe. The dataset is also available on Kaggle (https://www.kaggle.com/lievgarcia/amazon-reviews) and discussed on Medium (https://medium.com/@lievgarcia/deception-on-amazon-c1e30d977cfd). In each case, the source is the same author. According to my initial analyses, the data are not entirely implausible either.
My question is whether I can cite this data at all and, if so, how I should go about it. How would you handle this issue if you wanted to use the data?
Best regards,
René