-
Question:New Does anybody know about research building links between inter-observer reliability indices like kappa and recommender systems? Inter-observer reliability indices like Cohen's Kappa etc. are usually applied to densely populated contingency matrices with very few classes while r...
-
Answer added to:4 Can someone help me in understanding what k-means and r-means are and their importance in data mining? Why don't you use a better clustering algorithm? K-means is one of the simplest, and also one of the less accurate. I might suggest this paper in order ...
-
Answer added to:42 Is there any clustering method for clustering users by their features, e.g. their profile properties? Some of my colleagues had good results using Correlation-based Feature Selection. Data analysis software supports this technique, e.g. R, NumPy, and W...
-
Answer added to:5 What is the borderline between syntactic and semantic analysis in the dependency parsing technique stack? Rule-Based Semantic Tagging. An Application Undergoing Dictionary Glosses http://arxiv.org/abs/1305.3882
-
Answer added to:9 For users of large data sets, how does the establishment of a users' network enhance the use, understanding and accuracy of the data? At my work we have a listserv to post questions and thoughts about a very complex health care data set. I find it more useful than a users group mee...
-
Answer added to:5 Text mining and data mining? Data Mining (DM) is not just a single method or single technique but rather a spectrum of different approaches that search for patterns and relati...
-
Answer added to:1 What are the parameter options in Latent Semantic Indexing?
-
Answer added to:7 What are the best similarity measures for categorical data sets, and which feature selection method is best suited to them?
-
Answer added to:2 What are the research challenges in data mining for anomaly detection? While you can tweak the anomaly detection algorithms for better performance, the biggest challenge is in making sure that the data itself is accurate,...
-
Answer added to:2 Is there any relation between the semantic web and big data? Will the semantic web help in the translation of data to information? What if we create a standard for Big Data across multiple disciplines? Can the semantic web be more useful that way? I guess that will transfer processing...
-
Answer added to:1 Association Rule Mining in Cloud Computing. I don't know much about cloud computing, but I know about association rules. Association rules are usually used to mine associations among data items, h...
-
Answer added to:21 Which methods are there for content retrieval from HTML documents? You can use one of the document classification SDKs, such as uClassify. It is a good document classifier project. It is provided as an SDK for a number of lan...
-
Answer added to:23 How can we recognize the pattern of a high dispersion dataset? Do you think fractal distance measurement can solve this problem? Do you have any idea about fractal clustering to predict this dataset?
-
Answer added to:4 What can be done using the KDD Cup dataset and a real network traffic dataset in data mining? Mrs Melah, different data means other benchmark data, such as the ISCX dataset, while new data refers to one's own generated data.
-
Answer added to:10 Which will be a good machine learning technique for implementing text categorization? I think LDA is the best for this purpose.
-
Question:Open What, from your experience, are the best methods/algorithms to simultaneously treat heterogeneous raw traces for pattern mining purposes? Let's consider that we have 3 sources of activity traces generated from the use of an Intelligent Tutoring System. These traces are heterogeneous in t...
-
Question:Open Are there any data sets available for bug tossing in mining software repositories? Bug triaging is a vital task in software maintenance. Bug triaging is done using both the content of the bug and the tossing relationships among de...
-
Question:Open Why are anonymization and de-identification models useless in genomic data privacy? Genomic data privacy is essential when sharing genomic data with the public. How can the privacy of genomic data be protected? Which anony...
-
Answer added to:3 Where to get frequent itemset mining datasets? You can also find UCI machine learning datasets that are discretized and normalised for frequent itemset mining at the following link. http://csc.liv.a...
-
Answer added to:2 How to measure the quality of classification rules? There exist numerous interestingness measures for association patterns (and thus for classification rules). You should start with two research articles: "S...
-
Answer added to:7 Intrusion detection using data mining. Thanks to all for your suggestions, and sorry that I have not been active in the community for the last year, so I was unable to respond, but it's done and I h...
-
Answer added to:1 GOCE data. Dear Miguel, raw data processing of satellite gravimetry is quite complicated and needs at least a year to learn for practical use. If you want to mine ge...
-
Answer added to:6 How can you calculate the description length of a random forest? Thanks, Oliver and Paul.
-
Answer added to:4 Has anyone implemented any privacy-preserving data mining technique? There are few open source implementations on the web that consider privacy-preserving data mining, even though privacy-preserving data mining is a...
-
Answer added to:2 GPU accelerated solver for a large scale sparse Quadratic Program (QP)? Respected Sir, we cordially invite you to contribute a book chapter to the book "Cloud Infrastructures for Big Data Analytics". The proposed book is t...
-
Answer added to:1 Where can I find more examples on how to use SPARQL aggregates? Learning SPARQL by Bob DuCharme has a lot of nice examples on how to use SPARQL aggregates. http://www.amazon.com/Learning-SPARQL-ebook/dp/B005EI86BS...
-
Answer added to:7 MS SQL, MySQL, MS Access: which other database solutions are there like these? How can we compare them with respect to the availability of distributed database features? You can read my paper on a comparative study of cloud databases... In this paper I have compared 16 cloud databases.
-
Answer added to:2 Does anybody know how to implement matrix inversion in Hadoop? Mahout does not contain code for that. Do you really need the inverse of the matrix or do you just want to solve a linear system? If you aim for the ...
-
Question:Open What methods have recently been proposed for preserving string data in data publishing?
-
Answer added to:10 What is the best and most commonly used machine learning technique? This question was originally asked by David Wolpert. In his work he derived the No Free Lunch Theorem. This states that if you have no prior assump...
About Data Mining and Knowledge Discovery
It is an ongoing research project.