FTC algorithm
Hi, does anyone know about the FTC algorithm (Frequent Term-Based Text Clustering)?
I am having difficulty clustering the terms.
Thanks so much.
This is the algorithm:
FTC(database D, float minsup)
SelectedTermSets := {};
n := |D|;
RemainingTermSets := DetermineFrequentTermsets(D, minsup);
while |cov(SelectedTermSets)| ≠ n do
    for each set in RemainingTermSets do
        Calculate overlap for set;
    BestCandidate := element of RemainingTermSets with minimum overlap;
    SelectedTermSets := SelectedTermSets ∪ {BestCandidate};
    RemainingTermSets := RemainingTermSets - {BestCandidate};
    Remove all documents in cov(BestCandidate) from D and from the coverage of all of the RemainingTermSets;
return SelectedTermSets and the cover of the elements of SelectedTermSets;
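
In case it helps to pin down where I am stuck, here is a rough Python sketch of how I read the loop above. It is only a sketch under several assumptions that are not in the pseudocode: each document is a set of terms, `frequent_term_sets` is a naive hypothetical stand-in for DetermineFrequentTermsets (brute force up to `max_size` terms, where a real Apriori-style miner would normally be used), "overlap" is taken to be the entropy overlap, which I believe is the measure the FTC paper uses, and the loop also stops when no candidate with a non-empty cover is left, so it cannot run forever if minsup leaves some documents uncovered.

```python
from itertools import combinations
from math import log


def frequent_term_sets(docs, minsup, max_size=2):
    """Hypothetical stand-in for DetermineFrequentTermsets.

    Returns {term set: cover}, where cover is the set of indices of the
    documents containing every term of the term set, keeping only term
    sets whose cover holds at least minsup * len(docs) documents.
    """
    min_count = minsup * len(docs)
    vocab = sorted(set().union(*docs))
    result = {}
    for size in range(1, max_size + 1):
        for combo in combinations(vocab, size):
            cover = {i for i, d in enumerate(docs) if d.issuperset(combo)}
            if len(cover) >= min_count:
                result[frozenset(combo)] = cover
    return result


def entropy_overlap(cover, remaining):
    """Assumed overlap measure: sum over documents of -(1/f) * ln(1/f),
    where f is the number of remaining term sets covering the document."""
    total = 0.0
    for doc in cover:
        f = sum(1 for c in remaining.values() if doc in c)
        total += -(1.0 / f) * log(1.0 / f)
    return total


def ftc(docs, minsup, max_size=2):
    remaining = frequent_term_sets(docs, minsup, max_size)
    selected = []   # list of (term set, cluster of document indices)
    covered = set()
    # Extra stop condition (my assumption): give up when no candidates remain.
    while len(covered) != len(docs) and remaining:
        best = min(remaining,
                   key=lambda ts: entropy_overlap(remaining[ts], remaining))
        cluster = remaining.pop(best)
        selected.append((best, cluster))
        covered |= cluster
        # Remove the clustered documents from every remaining cover and
        # drop candidates whose cover became empty.
        remaining = {ts: c - cluster for ts, c in remaining.items()
                     if c - cluster}
    return selected


if __name__ == "__main__":
    docs = [{"sun", "beach"}, {"sun", "beach"},
            {"snow", "ski"}, {"snow", "ski"}]
    for term_set, cluster in ftc(docs, minsup=0.5):
        print(sorted(term_set), sorted(cluster))
```

As far as I understand it, the clusters are the covers of the selected term sets, not groups of terms: each selected term set only serves as the description of its cluster, and removing the covered documents after each pick is what keeps the clusters non-overlapping.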