In this project, students will implement the Apriori algorithm to mine association rules from discretized gene expression data. The implementation should generate candidate itemsets, prune them, and count their support using a hash-tree data structure. Students may also perform an interestingness analysis on the mined association rules.
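As a starting point, the level-wise loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the required implementation: the function names (`generate_candidates`, `apriori`, `rules`) and item labels such as `G1_Up` are hypothetical, the minimum support is an absolute count, and support counting uses a plain scan over the transactions where the project asks for a hash tree. Confidence and lift are shown as two common interestingness measures; the project does not prescribe which measures to use.

```python
from itertools import combinations


def generate_candidates(frequent_prev, k):
    """Join frequent (k-1)-itemsets that share their first k-2 items, then prune."""
    prev = sorted(frequent_prev)  # itemsets are sorted tuples
    prev_set = set(prev)
    candidates = []
    for i in range(len(prev)):
        for j in range(i + 1, len(prev)):
            a, b = prev[i], prev[j]
            if a[:k - 2] != b[:k - 2]:
                break  # list is sorted, so no later itemset shares this prefix
            cand = a + (b[-1],)
            # Prune step: every (k-1)-subset of a candidate must be frequent.
            if all(sub in prev_set for sub in combinations(cand, k - 1)):
                candidates.append(cand)
    return candidates


def apriori(transactions, min_support):
    """Return {itemset tuple: support count} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]
    counts = {}
    for t in transactions:
        for item in t:
            counts[(item,)] = counts.get((item,), 0) + 1
    frequent = {c: n for c, n in counts.items() if n >= min_support}
    result = dict(frequent)
    k = 2
    while frequent:
        candidates = generate_candidates(frequent, k)
        # Plain scan used here for brevity; a hash tree would replace this loop.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if t.issuperset(c):
                    counts[c] += 1
        frequent = {c: n for c, n in counts.items() if n >= min_support}
        result.update(frequent)
        k += 1
    return result


def rules(frequent, n_transactions, min_confidence):
    """Return (antecedent, consequent, confidence, lift) for qualifying rules."""
    out = []
    for itemset, count in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in combinations(itemset, r):
                consequent = tuple(i for i in itemset if i not in antecedent)
                confidence = count / frequent[antecedent]
                lift = confidence / (frequent[consequent] / n_transactions)
                if confidence >= min_confidence:
                    out.append((antecedent, consequent, confidence, lift))
    return out


# Hypothetical toy data: each transaction is one sample's discretized genes.
samples = [
    {"G1_Up", "G2_Down", "G3_Up"},
    {"G1_Up", "G2_Down"},
    {"G1_Up", "G3_Up"},
    {"G2_Down", "G3_Up"},
]
frequent_itemsets = apriori(samples, min_support=2)
for rule in rules(frequent_itemsets, len(samples), min_confidence=0.6):
    print(rule)
```

Keeping itemsets as sorted tuples makes the join step a simple prefix comparison and lets the prune step look subsets up directly in a set.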
Students should first run their Apriori implementation on a reduced dataset of around 15 genes, and then repeat the experiment with 20, 30, and 40 genes.
Possible extensions (extra credit): students can implement a tidset-based algorithm. Other options include the FP-growth, Max-Miner, Charm, and Border-Differential algorithms.
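For the tidset-based extension, one possible direction is an Eclat-style search, which stores for each item the set of transaction IDs (its tidset) and obtains the support of a larger itemset by intersecting tidsets. The sketch below is an assumption about what such an extension might look like, not a prescribed design; the function name `eclat` and the gene labels are hypothetical, and the minimum support is an absolute count.

```python
def eclat(transactions, min_support):
    """Return {itemset tuple: support count} via tidset intersection."""
    # Vertical layout: map each item to the IDs of transactions containing it.
    tidsets = {}
    for tid, t in enumerate(transactions):
        for item in t:
            tidsets.setdefault(item, set()).add(tid)

    result = {}

    def extend(prefix, items):
        # items: (item, tidset) pairs, each frequent when added to prefix.
        for i, (item, tids) in enumerate(items):
            itemset = prefix + (item,)
            result[itemset] = len(tids)
            suffix = []
            for other, other_tids in items[i + 1:]:
                shared = tids & other_tids  # support of itemset + (other,)
                if len(shared) >= min_support:
                    suffix.append((other, shared))
            extend(itemset, suffix)

    start = sorted(
        ((item, tids) for item, tids in tidsets.items() if len(tids) >= min_support),
        key=lambda pair: pair[0],
    )
    extend((), start)
    return result


# Hypothetical toy data: each transaction is one sample's discretized genes.
samples = [
    {"G1_Up", "G2_Down", "G3_Up"},
    {"G1_Up", "G2_Down"},
    {"G1_Up", "G3_Up"},
    {"G2_Down", "G3_Up"},
]
print(eclat(samples, min_support=2))
```

Because support comes from the size of an intersection, this approach needs no separate counting pass over the data, which makes it a useful comparison against the hash-tree counting in Apriori.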