To be truly parallel, sorting a single list when multiple CPU cores are available should show a significant speedup over a single threaded approach. Radix sorting lends itself to truly parallel implementations; consult the literature for approaches you might consider taking.
Remember that MSD is a sorting outcome, not a sorting algorithm so investigate sorting algorithms that lend themselves to parallel implementation.
SSH into any of the CSE111 teaching servers using your CruzID Blue credentials:
$ ssh <cruzid>@<server>.soe.ucsc.edu
Create a suitable place to work: ( only do this the first time you log in )
$ mkdir –p ~/CSE111/Assignment4
$ cd ~/ CSE111/Assignment4
Install the lab environment: ( only do this once )
$ tar xvf /var/classes/CSE111/Assignment4.tar.gz
Build, test, and check code coverage of the system:
$ make test
Check your implementation for memory leaks:
$ make valgrind
Calculate your expected grade:
$ make grade
( this will take some time )
Reset the system if you get into a mess:
$ make clean
Note that using shared machines like the CSE111 teaching servers leads to variable results when another user suddenly starts executing their tests whilst yours are running. On the automated grading system, your code will have exclusive access to all 24 CPUs, so will performance far more predictably.
Your completed radix sorter must do two things:
- Correctly MSD radix sort large vectors of unsigned integers
- Exhibit a significant speedup as more CPU cores are used
In addition, you must:
- Have no compiler warnings or memory leaks
- Demonstrate 100% function, line, and branch coverage
The first being a functional requirement, the second be a non-functional (performance) requirement.
These requirements are NOT equally weighted – see Grading Scheme below. However, it is recommended you work on the functional requirement first, and only then work on the non-functional requirement.
Remember: “make it work” always comes before “make it fast”, whilst always striving to “make it right”.
What you need to do
The class ParallelRadixSort is provided, but unimplemented. You need to implement it without changing the existing signature. You can add member variables and functions, but do not change existing signatures or overide the default constructor.
Basic steps are as follows:
- Investigate how a truly parallel MSD radix sort can be implemented
- Implement a multi-threaded truly parallel version of ParallelRadixSort::msd()
To execute your parallel MSD radix sorter after running make:
$ ./radix 10000 1 9 -d
Where “10000” is the number of random unsigned integers that the test harness will generate, “1” is the number of times it will generate that many random unsigned integers, and “9” is the maximum number of CPU cores to use when sorting the single list.
The -d flag requests a sampled dump of the sorted vectors to demonstrate (hopefully) correct ordering.
A more strenuous test might be:
$ ./perf 500000 1 4
Which will indicate the speedup achieved (or not) by your multi-threaded implementation. 100% indicates you archived no speedup, 200% indicates you doubled the performance, and so on. Speedups around 300% are easily achievable with simple implementations, anything over 400% will require significant effort and a sophisticated implementation.
As in previous assignments if there’s something you don’t understand, do this, in this order:
- Google it
- Post a discussion on Slack
- Come along to office hours and ask questions