CMSC 491W/691W Exam


  1. (20 points) You are given the following data set, which shows how 6 binary input variables affect a (binary) output variable. Construct an ID3-based decision tree as well as a perceptron (using the handout provided). Rate the two approaches in terms of the time needed to construct them, their "size", and their accuracy. Finally, cluster this data using any one of the approaches presented in class (FCM, SAHN, K-means, etc.).
    
    	    {1,0,1,0,0,0} 1
    	    {1,0,1,1,0,0} 1
    	    {1,0,1,0,1,0} 1
    	    {1,1,0,0,1,1} 1
    	    {1,1,1,1,0,0} 1
    	    {1,0,0,0,1,1} 1
    	    {1,0,0,0,1,0} 0
    	    {0,1,1,1,0,1} 1
    	    {0,1,1,0,1,1} 0
    	    {0,0,0,1,1,0} 0
    	    {0,1,0,1,0,1} 0
    	    {0,0,0,1,0,1} 0
    	    {0,1,1,0,1,1} 0
    	    {0,1,1,1,0,0} 0
    
  2. (15 points) A grocery store sells items coded a through z. Given that each of the following records represents the items sold in one transaction, you are asked to (i) generate the largest possible itemset with support greater than 5%, and (ii) generate all associations with a singleton consequent from this set and list their confidence.
    {a, b, c, d, f}, {a, f, g}, {a, c, g, h, k}, {b, d, e, k}, {e, l, m}, {l, n, p}, {a, p, q}, {b, q, r}, {c, q, r}, {p, r, t}, {d, p, r, s, t, u}, {s, t, u}, {b, s, t, u}, {u, v, z}, {u, v, w}, {w, z}
  3. (15 points) The Apriori algorithm we have discussed in class to generate association rules is quite time-consuming, since it is forced to make repeated scans of the database as it incrementally constructs large itemsets. Over the last few years, several researchers have suggested that parallel computing can be used to speed up the Apriori algorithm. Your task in this question is to propose some parallelization schemes for association rule generation. You should discuss your answer from the shared-memory perspective (each processor can access all of the memory directly). Provide details of the schemes and comment on their scalability. You may search the Compendex database for references to articles on this topic and base your answer on reading those. For extra credit, discuss the distributed-memory case as well.
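
As a reminder of the definitions that questions 2 and 3 rely on, the following is a minimal Python sketch of support and confidence, assuming support is measured as the fraction of transactions that contain an itemset. The transactions are copied from question 2; the names TRANSACTIONS, support, confidence and singleton_rules are illustrative only and do not come from the class handout. Each call to support makes one full pass over the transaction list, which is the repeated-scan cost that question 3 asks you to parallelize.

```python
# Transactions copied from question 2 (one set of item codes per record).
TRANSACTIONS = [
    set("abcdf"), set("afg"), set("acghk"), set("bdek"),
    set("elm"), set("lnp"), set("apq"), set("bqr"),
    set("cqr"), set("prt"), set("dprstu"), set("stu"),
    set("bstu"), set("uvz"), set("uvw"), set("wz"),
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)  # one full scan
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """support(antecedent and consequent together) / support(antecedent)."""
    base = support(antecedent, transactions)
    if base == 0:
        return 0.0  # rule is undefined when the antecedent never occurs
    both = support(set(antecedent) | set(consequent), transactions)
    return both / base


def singleton_rules(itemset, transactions):
    """Yield (antecedent, consequent, confidence) for each one-item consequent."""
    items = sorted(itemset)
    for c in items:
        antecedent = [i for i in items if i != c]
        yield antecedent, c, confidence(antecedent, [c], transactions)


if __name__ == "__main__":
    # Example: every rule with a singleton consequent drawn from {s, t, u}.
    for lhs, rhs, conf in singleton_rules("stu", TRANSACTIONS):
        print(lhs, "->", rhs, "confidence", conf)
```

The itemset {s, t, u} is used here only as an example input; it is not claimed to be the answer to part (i).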

Anupam Joshi
Last modified: Tue Apr 20 13:42:15 EDT 1999