UMBC CMSC202, Computer Science II, Spring 1998, Sections 0101, 0102, 0103, 0104 and Honors

Project 3: A Sparse Matrix ADT

Due: Wednesday, March 18, 1998

Objective

The objective of the project is to gain experience working with dynamically allocated arrays and complex data structures for an abstract data type.

Background

Matrices are usually stored in a two-dimensional array. Thus, a matrix with m rows and n columns would use O(mn) storage space. A sparse matrix is a matrix where "most" of its entries are zero. Using a two-dimensional array to store a sparse matrix is a terrible waste of memory, because most of the memory would used to store zeroes. In this project, you will explore the use of a complex data structure to store sparse matrices and you will implement several matrix operations to support this data structure.

Assignment

The data structure and the function prototypes of the matrix operations that you must implement are stored in the header file:

~chang/pub/cs202/proj3/sparse.h

You should copy this file, but you should not modify it in any way. The data structure for sparse matrices is described by the following type definitions:

typedef struct { int column_index ; double value ; } entry ; typedef struct { int count ; int limit ; entry *row ; } row_header ; typedef struct { int rows ; int columns ; row_header *headers ; } matrix ;

Instead of storing the entries of a sparse matrix as an element of a two-dimensional array, each entry will be represented as a record of type entry. The first field of this vector stores the column of the entry, the second field stores the value of the entry. All the entries of the matrix that appear in the same row are grouped together as a dynamically allocated array, sorted by the column index field. The location of this array is stored in the third field, row, of a record of type row_header. The first field of this record, count, contains the number of non-zero entries in that row of the sparse matrix. The second field, limit, is used to store the amount of space currently allocated for the array that the row field points to. The limit field is useful for reallocating the space for this array. Since we need a record of type row_header for each row of the matrix, these can be grouped together as a dynamically allocated array. We record the location of this array of row_header's in the third field of a matrix record. The other two fields of this record is used to store the dimensions of the sparse matrix. Finally, we will usually work with a pointer to a record of type matrix, rather than the record itself. So, we define a new type for a pointer to a matrix.

typedef matrix *matrix_ptr ;

Again, you should not modify any of these type definitions. For example, you are not allowed to add extra fields to any of the records.

You must also implement the following operations for sparse matrices for your project.

matrix_ptr LoadMatrix(char *filename) ; int StoreMatrix(char *filename, matrix_ptr A) ;

The LoadMatrix function opens the file given by filename and creates a sparse matrix data structure for the matrix entries stored in that file. The file is a text file with the following format. The first line contains the number of rows and the number of columns in the matrix. Each succeeding line contains the row, column and value of a non-zero entry of the matrix. Note that the entries do not appear in any particular order. LoadMatrix should return a pointer to the matrix data structure created, or NULL if any errors are encountered. StoreMatrix is the reverse of LoadMatrix. StoreMatrix should return 1 if the operation is successful and 0 otherwise.

matrix_ptr AddMatrix(matrix_ptr A, matrix_ptr B) ; matrix_ptr SubtractMatrix(matrix_ptr A, matrix_ptr B) ;

The AddMatrix function creates a new matrix which contains the sum of the two given matrices and returns a pointer to the new matrix. Note that addition for matrices is only defined for matrices with the same dimensions. Each entry of the sum is simply the sum of the corresponding entries in A and B. Subtraction is defined similarly. Your implementation of AddMatrix and SubtractMatrix must exploit the fact that these are sparse matrices. In particular, you should not consider entries that are zero in both matrices.

matrix_ptr MultiplyMatrix(matrix_ptr A, matrix_ptr B) ;

The MultiplyMatrix function creates a new matrix which contains the product of the two given matrices. The return value is a pointer to the new matrix. The product of two matrices is defined mathematically as follows. Let A be a matrix with r rows and s columns and let B be a matrix with s rows and t columns. The product of A and B is a matrix C with r rows and t columns. Note that the number of columns in A must equal the number of rows in B. Let A(i,j), B(i,j) and C(i,j) denote the entry in the i-th row and j-th column of the respective matrices. Then, the value in C(i,j) is defined as the sum A(i,1)B(1,j) + A(i,2)B(2,j) + ... + A(i,s)B(s,j).

matrix_ptr PowerMatrix(matrix_ptr A, int n) ;

The PowerMatrix function creates a new matrix which contains Aⁿ, the product of the matrix A with itself n times. This is only defined for square matrices -- i.e., matrices with the same number of rows as columns. Your implementation of the power function should be recursive and use the fact that if n is even, then Aⁿ = A^n/2 A^n/2 and if n is odd, then Aⁿ = A A^n/2 A^n/2. The return value is a pointer to the new matrix. The return value should be NULL for n <= 0.

matrix_ptr CopyMatrix (matrix_ptr A) ;

The CopyMatrix function creates a new matrix which has the same entries as the matrix A. Note that each row of the matrix must be duplicated --- i.e., the matrix A and the new matrix must not share any storage space. The return value is a pointer to the new matrix.

void FreeMatrix (matrix_ptr A) ;

The FreeMatrix function frees all dynamically allocated storage associated with the matrix A. Calls to FreeMatrix should be made if A is no longer needed.

Implementation Issues

For this project, efficiency in terms of storage space and running time are important considerations. Inefficient implementations will have points deducted. The following are some efficiency issues.

In LoadMatrix, you should read through the file only once.
When you build a matrix, it is faster to insert all the elements in a row and sort the entire row after all the entries have been added.
The entries in each row is sorted by column index. When you search for an entry with a certain column index, you should use binary search.
The amount of storage you use for a sparse matrix should be proportional to the number of non-zero entries in the matrix. For example, you must not use a two dimensional array to store a sparse matrix even temporarily. This defeats the whole purpose of the sparse matrix data structure.
When you reallocate the size of a dynamically allocated array, you should double its size. Increasing the size by 1 is horribly inefficient.

Other implementation notes:

The matrix may contain rows where all values are zero. In this case, the row field of a row_header record would be NULL. This is the source of many bugs.
Some test files will be made available in the directory ~chang/pub/cs202/proj3/ in the near future. However, you should not depend solely upon these files to check the correctness of your program. The graders will be using other examples to test your programs.

Turning in your program

You should submit the following 4 files with the following contents. The names of these files must be exact.

A file called sparse.c which contains the implementations of all of the sparse matrix operations. It is important that this file does not contain a function called main.
A file called main.c which contains a main function that tests the sparse matrix operations and reports the results of the tests. This file should contain only the main function and no other functions. In particular, the functions implemented in sparse.c should not depend on anything in main.c.
A file called typescript which contains sample runs of your project.
A file called README which describes which functions you have implemented and any note that you would like to pass along to the grader.

You should not submit the sparse.h header file. Since you are not allowed to change this file, your project should compile just fine if the graders use my copy. You may submit other header files or other .c files. Of course, none of these should have names that conflict with the above.

One purpose of these explicit directions is to allow the graders to compile your functions with a main function that checks the correctness of your functions. It is important that this can be done automatically without human intervention and without having to ask you to resubmit any files, which would cause unnecessary delays in grading. Project submissions that cannot be tested automatically will have a mandatory 10% deduction.

Last Modified: 5 Mar 1998 00:20:41 EST by Richard Chang

Back up to Spring 1998 CMSC 202 Section Homepage