
CS 211: High Performance Computing Project 2


Part 1 (50 points)

Attached is a Matlab program that solves the general linear system Ax=b and verifies the
solution against the Matlab built-in solver. The framework provides two approaches:

(1). Call the function dgetrf() in LAPACK to perform the LU factorization of the
coefficient matrix A, then call the BLAS function dtrsm() once to perform the forward
substitution and once more to perform the backward substitution. This approach is already
implemented in the framework; use it to verify the correctness of your own implementation
of approach (2).
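
The three steps of approach (1) can be written out as follows. The symbols P, L, U, and y
are not named in the text above and are introduced only for this sketch: P is the row
permutation produced by partial pivoting, L is unit lower triangular, and U is upper
triangular.

```latex
\begin{aligned}
PA &= LU && \text{(LU factorization, \texttt{dgetrf})} \\
Ly &= Pb && \text{(forward substitution, first \texttt{dtrsm} call)} \\
Ux &= y  && \text{(backward substitution, second \texttt{dtrsm} call)}
\end{aligned}
```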

(2). Call mydgetrf() and mydtrsv(), which you implement yourself, to perform the LU
factorization, forward substitution, and backward substitution. Add the necessary code
to the for_you_to_do.c file.
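
As a rough illustration of approach (2), the sketch below shows one way the two functions
could look in C. It is not the framework's reference code: the parameter lists, the int
return values, the row-major storage of A, and the meaning of ipiv (ipiv[i] is the original
index of the row now in position i) are all assumptions made for this example, so check
them against the attached Matlab code and for_you_to_do.c.

```c
#include <assert.h>
#include <math.h>
#include <stdlib.h>
#include <string.h>

/* Assumed signature. In-place LU with partial pivoting on a row-major n x n
 * matrix. On return A holds L (unit lower, below the diagonal) and U.
 * Returns 0 on success, -1 if a pivot column is entirely zero. */
int mydgetrf(double *A, int *ipiv, int n)
{
    assert(n > 0);
    for (int i = 0; i < n; i++)
        ipiv[i] = i;

    for (int i = 0; i < n; i++) {
        /* Pivot search: largest |A(t,i)| for t >= i. */
        int maxind = i;
        double maxval = fabs(A[i * n + i]);
        for (int t = i + 1; t < n; t++) {
            double v = fabs(A[t * n + i]);
            if (v > maxval) { maxval = v; maxind = t; }
        }
        if (maxval == 0.0)
            return -1;

        if (maxind != i) {
            /* Record the swap and exchange the two full rows. */
            int tmp = ipiv[i]; ipiv[i] = ipiv[maxind]; ipiv[maxind] = tmp;
            for (int k = 0; k < n; k++) {
                double d = A[i * n + k];
                A[i * n + k] = A[maxind * n + k];
                A[maxind * n + k] = d;
            }
        }

        /* Store the multipliers and update the trailing rows. */
        for (int j = i + 1; j < n; j++) {
            A[j * n + i] /= A[i * n + i];
            for (int k = i + 1; k < n; k++)
                A[j * n + k] -= A[j * n + i] * A[i * n + k];
        }
    }
    return 0;
}

/* Assumed signature. UPLO == 'L': forward substitution L y = P b.
 * UPLO == 'U': backward substitution U x = y. B is overwritten.
 * Returns 0 on success, -1 on bad UPLO or allocation failure. */
int mydtrsv(char UPLO, const double *A, double *B, int n, const int *ipiv)
{
    assert(n > 0);
    if (UPLO == 'L') {
        double *y = malloc((size_t)n * sizeof *y);
        if (y == NULL)
            return -1;
        for (int i = 0; i < n; i++) {
            double sum = B[ipiv[i]];          /* apply the row permutation */
            for (int j = 0; j < i; j++)
                sum -= A[i * n + j] * y[j];
            y[i] = sum;                       /* diagonal of L is 1 */
        }
        memcpy(B, y, (size_t)n * sizeof *y);
        free(y);
    } else if (UPLO == 'U') {
        for (int i = n - 1; i >= 0; i--) {
            double sum = B[i];
            for (int j = i + 1; j < n; j++)
                sum -= A[i * n + j] * B[j];
            B[i] = sum / A[i * n + i];
        }
    } else {
        return -1;
    }
    return 0;
}
```

Under these assumptions, a full solve is mydgetrf(A, ipiv, n) followed by
mydtrsv('L', A, b, n, ipiv) and then mydtrsv('U', A, b, n, ipiv), after which b holds x.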

Your C functions mydgetrf() and mydtrsv() should follow the same algorithm as the
Matlab code. Do not perform any advanced code optimization or use any compiler
optimization flags. Trivial differences between your code and the Matlab code are
acceptable. The framework will test your code with random matrices of size 1000, 2000,
3000, 4000, and 5000 on TARDIS. Compare the performance (i.e., Gflops) of the two approaches.
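
For the Gflops comparison, a minimal sketch of the arithmetic is given below. It assumes
the usual leading-order flop count of (2/3)n^3 for LU factorization; the framework may
count operations differently, and the helper names lu_flops, gflops, and cpu_seconds are
hypothetical.

```c
#include <assert.h>
#include <time.h>

/* Leading-order flop count of LU on an n x n matrix: (2/3) n^3. */
double lu_flops(int n)
{
    double dn = (double)n;
    return 2.0 * dn * dn * dn / 3.0;
}

/* Convert a flop count and an elapsed time in seconds into Gflops. */
double gflops(double flops, double seconds)
{
    assert(seconds > 0.0);
    return flops / seconds / 1e9;
}

/* CPU time consumed by this process so far, in seconds. */
double cpu_seconds(void)
{
    return (double)clock() / CLOCKS_PER_SEC;
}
```

For example, a factorization of a 3000 x 3000 matrix that takes 9 seconds corresponds to
about 1.8e10 flops, which is 2 Gflops under this count.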

Part 2 (50 points)

Implement the blocked GEPP (Gaussian elimination with partial pivoting) algorithm from the
lecture as mydgetrf_block() by adding code to for_you_to_do.c. (1). Solve the linear system
with your blocked code, using your own matrix multiplication code. (2). Optimize your code
to achieve the highest performance you can, using any other techniques available to you.
Compare its performance with your unoptimized version from Part 1.
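
The sketch below shows one common form of blocked GEPP and may differ from the version
given in the lecture. The signature of mydgetrf_block() (including the block size
parameter b), the row-major layout, the ipiv convention (ipiv[i] is the original index of
the row now in position i), and the helper mydgemm_sub() are assumptions for illustration.
The matrix multiplication is a plain triple loop with no optimization.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical helper: C (m x n) -= A (m x k) * B (k x n). All three are
 * sub-blocks of one row-major array with leading dimension ld. */
static void mydgemm_sub(const double *A, const double *B, double *C,
                        int m, int n, int k, int ld)
{
    for (int i = 0; i < m; i++)
        for (int p = 0; p < k; p++) {
            double a = A[i * ld + p];
            for (int j = 0; j < n; j++)
                C[i * ld + j] -= a * B[p * ld + j];
        }
}

/* Assumed signature. Blocked in-place LU with partial pivoting, block size b.
 * Returns 0 on success, -1 if a pivot column is entirely zero. */
int mydgetrf_block(double *A, int *ipiv, int n, int b)
{
    assert(n > 0 && b > 0);
    for (int i = 0; i < n; i++)
        ipiv[i] = i;

    for (int ib = 0; ib < n; ib += b) {
        int end = (ib + b < n) ? ib + b : n;

        /* Step 1: unblocked GEPP on the panel A(ib:n, ib:end). */
        for (int i = ib; i < end; i++) {
            int maxind = i;
            double maxval = fabs(A[i * n + i]);
            for (int t = i + 1; t < n; t++) {
                double v = fabs(A[t * n + i]);
                if (v > maxval) { maxval = v; maxind = t; }
            }
            if (maxval == 0.0)
                return -1;
            if (maxind != i) {
                int tmp = ipiv[i]; ipiv[i] = ipiv[maxind]; ipiv[maxind] = tmp;
                for (int k = 0; k < n; k++) {   /* swap the full rows */
                    double d = A[i * n + k];
                    A[i * n + k] = A[maxind * n + k];
                    A[maxind * n + k] = d;
                }
            }
            for (int j = i + 1; j < n; j++) {
                A[j * n + i] /= A[i * n + i];
                for (int k = i + 1; k < end; k++)   /* panel columns only */
                    A[j * n + k] -= A[j * n + i] * A[i * n + k];
            }
        }

        /* Step 2: A(ib:end, end:n) = LL^{-1} * A(ib:end, end:n), where LL is
         * the unit lower triangle of the diagonal block. */
        for (int i = ib + 1; i < end; i++)
            for (int p = ib; p < i; p++) {
                double l = A[i * n + p];
                for (int j = end; j < n; j++)
                    A[i * n + j] -= l * A[p * n + j];
            }

        /* Step 3: trailing update A(end:n, end:n) -= L21 * U12. */
        if (end < n)
            mydgemm_sub(&A[end * n + ib], &A[ib * n + end], &A[end * n + end],
                        n - end, n - end, end - ib, n);
    }
    return 0;
}
```

Most of the arithmetic lands in Step 3, so that matrix multiplication is the natural
place to apply optimizations for item (2), and the block size b is a tuning parameter.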

Note that the syllabus emphasizes, for ALL homework assignments: “Please make sure
that your programs are properly documented and indented. Provide instructions on how to
run your programs, give example runs, and analyze your results.”