Data Mining and Knowledge Discovery

  • Question:
    New Does anybody know about research building links between inter-observer reliability indices like kappa and recommender systems?
    Inter-observer reliability indices like Cohen's Kappa etc. are usually applied to densely populated contingency matrices with very few classes while r... [more]
    By Matthias Scheller Lichtenauer · Empa - Swiss Federal Laboratories for Materials Science and Technology
  • Answer added to:
    4 Can someone help me understand what k-means and r-means are and their importance in data mining?
    By Arun Rajendran · University of South Wales
    Antonio Quintana · Center for Research and Advanced Studies of the National Polytechnic Institute
    Why don't you use a better clustering algorithm? K-means is one of the simplest, and also one of the less accurate. I might suggest this paper in order ... [more]
  • Answer added to:
    42 Is there any clustering method for clustering users by their features, e.g. their profile properties?
    By Afarin Adami · Islamic Azad University of Najafabad
    Michael Brückner · Naresuan University
    Some of my colleagues had good results using Correlation-based Feature Selection. Data analysis software supports this technique, e.g. R, NumPy, and W... [more]
  • Answer added to:
    5 What is the border line between syntax and semantic analysis in dependency parsing technique stack?
    By Alexander Solovyev · Bauman Moscow State Technical University
    Daniel Christen · Independent Researcher
    Rule-Based Semantic Tagging. An Application Undergoing Dictionary Glosses http://arxiv.org/abs/1305.3882 
  • Answer added to:
    9 For users of large data sets, how does the establishment of a users' network enhance the use, understanding, and accuracy of the data?
    By Aoife Lawton · Health Service Executive
    Rick Tivis · Idaho State University
    At my work we have a listserve to post questions and thoughts about a very complex health care data set. I find it more useful than a users group mee... [more]
  • Answer added to:
    5 Text mining and Data mining?
    By Waseem Alromimah · Ain Shams University
    Dr. Samaher Hussein Ali · University of Babylon
    Data Mining (DM) is not just a single method or single technique but rather a spectrum of different approaches, which searches for patterns and relati... [more]
  • Answer added to:
    1 What are the parameter options in Latent Semantic Indexing?
    By Muna Alsallal · Coventry University
  • Answer added to:
    7 What is the best similarity measure for categorical data sets, and which feature selection method is best suited to them?
    By Afarin Adami · Islamic Azad University of Najafabad
  • Answer added to:
    2 What are the research challenges in data mining for anomaly detection?
    By Deepthi Prasad · B. N. M. Institute of Technology
    Shreeder Adibhatla · General Electric Company
    While you can tweak the anomaly detection algorithms for better performance, the biggest challenge is in making sure that the data itself is accurate,... [more]
  • Answer added to:
    2 Is there any relation between semantic web and big data? Will semantic web help in translation of data to information?
    By Shahkar Tramboo · University of Kashmir
    Shahkar Tramboo · University of Kashmir
    What if we create a standard for Big Data across multiple disciplines? Can the semantic web be more useful that way? I guess that will transfer processing... [more]
  • Answer added to:
    1 Association Rule Mining in Cloud Computing
    By Sudhakar Singh · Banaras Hindu University
    Nur Hayatin · University of Muhammadiyah Malang
    I don't know much about cloud computing, but I know about association rules. Association rules are usually used to mine data associations with each other, h... [more]
  • Answer added to:
    21 Which methods are there for content retrieval from HTML documents?
    By Nils Haldenwang · Universität Osnabrück
    Milan Tair · Singidunum University
    You can use a document classification SDK such as uClassify. It is a good document classifier project. It is provided as an SDK for a number of lan... [more]
  • Answer added to:
    23 How can we recognize the pattern of a high dispersion dataset?
    By Vahid Nouri · Islamic Azad University Mashhad Branch
    Vahid Nouri · Islamic Azad University Mashhad Branch
    Do you think fractal distance measurement can solve this problem? Do you have any ideas about using fractal clustering to predict this dataset?
  • Answer added to:
    4 What can be done using KDDcup dataset and real network traffic dataset in data mining?
    By Noor Melah · German-Malaysian Institute
    Warusia Mohamed · Putra University, Malaysia
    Mrs Melah, different data means other benchmark data such as the ISCX dataset, while new data refers to your own generated data.
  • Answer added to:
    10 Which will be a good machine learning technique for implementing text categorization?
    By Lanwin Lobo · St Aloysius College
    Haibo Zheng · Nanjing University of Posts and Telecommunications
    I think LDA is the best for this purpose. 
  • Question:
    Open In your experience, what are the best methods/algorithms to simultaneously process heterogeneous raw traces for pattern mining purposes?
    Let's consider that we have 3 sources of activity traces generated from the use of an Intelligent Tutoring System. These traces are heterogeneous in t... [more]
    By Ben-Manson Toussaint · Ecole Superieure d'Infotronique d'Haiti
  • Question:
    Open Are there any data sets available for bug tossing in mining software repositories?
    Bug Triaging is a vital task in software maintenance. Bug Triaging is done using both content of the bug as well as the tossing relationships among de... [more]
    By Akila Gopu · Pondicherry Engineering College
  • Question:
    Open Why are anonymization and de-identification models useless in genomic data privacy?
    Genomic data privacy is essential when sharing genomic data with the public. How can the privacy of genomic data be protected? Which anony... [more]
    By Mahesh Ramadoss · Alagappa University
  • Answer added to:
    3 Where to get Frequent itemset mining datasets?
    By P.Prabhu P · Alagappa University
    Hemavathi Chandrasekaran · Thiagarajar College of Engineering
    You can also find UCI machine learning datasets that are discretized and normalised for frequent itemset mining at the following link. http://csc.liv.a... [more]
  • Answer added to:
    2 How to measure the quality of classification rules?
    By Gnana Sankaran · Bharathidasan University
    Dominique Gay · Orange Labs
    There exist numerous interestingness measures for association patterns (and thus for classification rules). You should start with two research articles: "S... [more]
  • Answer added to:
    7 Intrusion detection using Datamining
    By Ankita Gaur · Rajiv Gandhi Proudyogiki Vishwavidyalaya
    Ankita Gaur · Rajiv Gandhi Proudyogiki Vishwavidyalaya
    Thanks to all for your suggestions, and sorry that I was not active in the community for the last year, so I was unable to respond, but it's done and I h... [more]
  • Answer added to:
    1 Goce data.
    By Miguel Arias · National University of Colombia
    Gábor Timár · Eötvös Loránd University
    Dear Miguel, raw data processing of satellite gravimetry is quite complicated and needs min. a year to learn for practical use. If you want to mine ge... [more]
  • Answer added to:
    6 How can you calculate the description length of a random forest?
    By Pouya Sinaian · Blekinge Institute of Technology
    Pouya Sinaian · Blekinge Institute of Technology
    Thanks Oliver and Paul 
  • Answer added to:
    4 Has anyone implemented any privacy preserving data mining technique?
    By Jitendra Jindal · School Of Computer Engineering
    Tamer AbuHmed · Inha University
    There are few open-source implementations on the web that consider privacy-preserving data mining, even though privacy-preserving data mining is a... [more]
  • Answer added to:
    2 GPU accelerated solver for a large scale sparse Quadratic Program (QP)?
    By Marco Trincavelli · Örebro universitet
    Ganesdh Deka · Directorate General of Employment & Training
    Respected Sir, we cordially invite you to contribute a book chapter to the book "Cloud Infrastructures for Big Data Analytics". The proposed book is t... [more]
  • Answer added to:
    1 Where can I find more examples on how to use SPARQL aggregates?
    By Adrian Brasoveanu · MODUL University Vienna
    Alex Thomo · University of Victoria
    Learning SPARQL by Bob DuCharme has a lot of nice examples of how to use SPARQL aggregates. http://www.amazon.com/Learning-SPARQL-ebook/dp/B005EI86BS... [more]
  • Answer added to:
    7 Besides MS SQL, MySQL, and MS Access, what other database solutions are there? How can we compare them with respect to the availability of distributed database features?
    By Karimkhan Pathan · Nirma University
    Ganesdh Deka · Directorate General of Employment & Training
    You can read my paper on a comparative study of cloud databases. In this paper I compare 16 cloud databases.
  • Answer added to:
    2 Does anybody know how to implement matrix inversion in Hadoop?
    By Nikzad Babaii Rizvandi · University of Sydney
    Sebastian Schelter · Technische Universität Berlin
    Mahout does not contain code for that. Do you really need the inverse of the matrix or do you just want to solve a linear system? If you aim for the ... [more]
  • Question:
    Open What methods have recently been proposed for preserving string data in data publishing?
    By Mahesh Ramadoss · Alagappa University
  • Answer added to:
    10 What is the best and most commonly used machine learning technique?
    By Khalid Raza · Jamia Millia Islamia
    Arturo Geigel · Polytechnic University of Puerto Rico
    This question was originally asked by David Wolpert. In his work he derived the No Free Lunch Theorem. This states that if you have no prior assump... [more]

About Data Mining and Knowledge Discovery

This is an ongoing research project.

Topic Followers (6425)