Question
Asked 23rd Aug, 2015

Hello everyone, can someone explain the best way to calculate the minimum support for the FP-Growth operator in RapidMiner or Weka?

I am text mining a text field with more than 10,000 entries. The data is very heterogeneous, and I am trying to create association rules, so the results from the FP-Growth operator are very important and I need to calculate the threshold as well as possible.

All Answers (3)

25th Aug, 2015
Antoon Bronselaer
Ghent University
Hello Lina,
choosing the value of the min support is always tricky. Usually, you choose a relatively low value. You will get a lot of itemsets (and rules derived from them), but you can rank the frequent itemsets (or the derived rules) according to some measure of interestingness, such as confidence, lift, conviction, or leverage.
See for example the following paper:
Hope this helps.
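To make the ranking idea concrete, here is a minimal Python sketch of the four measures named above, computed from raw transaction counts using their standard textbook definitions. The function name `rule_measures`, the rule labels, and all the counts are made up for illustration; this is not the RapidMiner or Weka implementation, only a way to see how a ranking by lift could work.

```python
from math import inf


def rule_measures(n_total, n_a, n_c, n_both):
    """Measures for a rule A -> C, given counts of transactions.

    n_total: all transactions, n_a: those containing A,
    n_c: those containing C, n_both: those containing A and C.
    """
    supp_a = n_a / n_total
    supp_c = n_c / n_total
    supp_rule = n_both / n_total
    confidence = n_both / n_a
    lift = (n_both * n_total) / (n_a * n_c)
    leverage = supp_rule - supp_a * supp_c
    # Conviction is undefined (infinite) when the rule never fails.
    conviction = inf if n_both == n_a else (1 - supp_c) / (1 - confidence)
    return {
        "support": supp_rule,
        "confidence": confidence,
        "lift": lift,
        "leverage": leverage,
        "conviction": conviction,
    }


# Hypothetical counts out of 1000 transactions: (n_a, n_c, n_both).
N_TOTAL = 1000
candidate_rules = {
    "printer -> paper": (100, 200, 50),
    "login -> password": (80, 90, 72),
    "error -> printer": (400, 100, 40),
}

measures = {
    name: rule_measures(N_TOTAL, *counts)
    for name, counts in candidate_rules.items()
}

# Rank the rules from most to least interesting according to lift.
ranked = sorted(measures, key=lambda name: measures[name]["lift"], reverse=True)

for name in ranked:
    print(name, round(measures[name]["lift"], 2))
```

With a low min support you would typically get many such rules, and sorting by one of these measures (or filtering on lift above 1) is one way to keep only the ones worth reading.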
1 Recommendation
26th Aug, 2015
Vasudha Bhatnagar
University of Delhi
If the support threshold is too low, execution may fail because of the large number of frequent itemsets. I think you have to resort to trial and error. There is no rule of thumb for identifying the best support threshold.
Good luck.
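The trial-and-error approach can be sketched as a small sweep: start with a high threshold, lower it step by step, and watch how fast the number of frequent itemsets grows. The Python below is only an illustration under stated assumptions: it uses a brute-force counter limited to itemsets of size 1 and 2 instead of FP-Growth, and the function name `count_frequent_itemsets` and the toy transactions are invented for the example.

```python
from collections import Counter
from itertools import combinations


def count_frequent_itemsets(transactions, min_support, max_size=2):
    """Count itemsets (up to max_size items) whose support >= min_support.

    Brute-force enumeration, fine for a toy example but not a
    substitute for FP-Growth on real data.
    """
    n = len(transactions)
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        for size in range(1, max_size + 1):
            counts.update(combinations(items, size))
    return sum(1 for c in counts.values() if c / n >= min_support)


# Toy transactions: each one is the set of terms found in one text entry.
transactions = [
    ["printer", "error", "paper"],
    ["printer", "error"],
    ["login", "password", "error"],
    ["login", "password"],
    ["printer", "paper"],
]

# Lower the threshold step by step and record how many itemsets survive.
sweep = {
    min_support: count_frequent_itemsets(transactions, min_support)
    for min_support in (0.6, 0.4, 0.2)
}

for min_support, n_itemsets in sweep.items():
    print(f"min_support={min_support}: {n_itemsets} frequent itemsets")
```

On real data you would run the actual FP-Growth operator at each step and stop lowering the threshold once the number of itemsets becomes too large to process or to inspect.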
1 Recommendation
3rd Sep, 2015
Luis ALBERTO Sanchez
National Open University and Distance (Colombia)
I suggest:
1) Do a univariate statistical analysis in order to interpret your data properly: find information such as the mean, mode, median, skewness, variance, confidence intervals, minimum, and maximum.
2) Build cross tables (bivariate and multivariate) and run a correlation analysis.
3) Before jumping to wrong assumptions, become familiar with the data and learn the actual relations in it.
4) Set the FP-Growth operator's parameters based on the preceding statistical analysis.
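For text data, one possible reading of the first step is to look at the distribution of per-term support (the fraction of entries containing each term) before picking a threshold. The sketch below assumes each entry has already been tokenized into a list of terms; the function name `item_support_summary` and the toy data are hypothetical.

```python
from collections import Counter
from statistics import mean, median


def item_support_summary(transactions):
    """Summarize the support of individual items across transactions."""
    n = len(transactions)
    # Count each item at most once per transaction.
    doc_freq = Counter(item for t in transactions for item in set(t))
    supports = sorted(count / n for count in doc_freq.values())
    return {
        "n_items": len(supports),
        "min": supports[0],
        "median": median(supports),
        "mean": mean(supports),
        "max": supports[-1],
    }


# Toy transactions: each one is the list of terms in one text entry.
transactions = [
    ["printer", "error", "paper"],
    ["printer", "error"],
    ["login", "password", "error"],
    ["login", "password"],
    ["printer", "paper"],
]

summary = item_support_summary(transactions)
for name, value in summary.items():
    print(name, value)
```

A min support far above the maximum single-term support cannot produce any frequent itemset, while one below the median keeps more than half of the terms, so a summary like this gives a rough range in which to start experimenting.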
Au revoir
Luis Sanchez 
