The following is the reading list for CS 214: Parallel Algorithms. It collects recent advances in parallel algorithms, organized by topic. You should read this list and select some topics from it for paper reading and the course presentation.

We have covered many algorithms during the lectures, but as a 10-week class, we do not have time to cover algorithms for many other important problems, or many ideas for parallel algorithm design. As a result, we provide this list of papers with brief introductions. You can read these papers to enhance your understanding of the course content, and pick topics for the optional course presentations in the last week. You can either give an overview talk on the high-level ideas in several papers, focus on one specific paper, or introduce a single algorithm from one paper in more depth.

If you want to give a talk, make sure you contact the instructor regarding the content, and make sure the length fits within the time slot.

If the slots for course presentations are full, you can alternatively submit a paper reading for bonus credit. A paper reading should contain 1000–3000 words, and will be graded on the quality of the writing.

**1 GENERAL IDEAS FOR PARALLEL ALGORITHMS **

In class, we mainly focus on the binary fork-join model. We mainly discuss the simplest algorithm for each problem, and mention in several places that the bounds can be improved by some recent papers. Many of these results are from the following paper: “Optimal Parallel Algorithms in the Binary-Forking Model” [11]. These algorithms are quite simple, and you are welcome to discuss some of them.
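To make the model concrete, here is a toy sketch of a binary fork-join reduction (a sum). Each call forks the left half and continues with the right half, then joins; this is exactly the two-way fork/join primitive the model provides. The code is illustrative only (CPython threads give no real speedup here), and all names are our own.

```python
import threading

def reduce_sum(a, lo, hi):
    """Sum a[lo:hi] in binary fork-join style."""
    if hi - lo <= 16:                 # small base case runs sequentially
        return sum(a[lo:hi])
    mid = (lo + hi) // 2
    box = {}
    # fork: left half proceeds on its own thread
    t = threading.Thread(target=lambda: box.update(l=reduce_sum(a, lo, mid)))
    t.start()
    right = reduce_sum(a, mid, hi)    # this thread continues with the right half
    t.join()                          # join: wait for the forked child
    return box["l"] + right
```

With unit-cost fork/join, this recursion has O(n) work and O(log n) span, the kind of bound analyzed in [11].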

Space consumption is another critical objective we should take care of when designing parallel algorithms, as they are supposed to process large-scale data. A space-efficient parallel algorithm can run larger instances on the same machine, and is presumably faster since it has a smaller memory footprint. Many interesting space-efficient parallel algorithms are designed in the paper “Parallel In-Place Algorithms: Theory and Practice” [35]. You are welcome to talk about the high-level ideas in designing these algorithms, and provide some specific examples.

Given a computation with a certain dependency DAG, are there general approaches to parallelizing it? We refer you to the following two papers: “Internally Deterministic Parallel Algorithms Can Be Fast” [10] and “Many Sequential Iterative Algorithms Can Be Parallel and (Nearly) Work-efficient” [42] for some interesting discussions.
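One recurring idea in these papers is to parallelize a sequential greedy algorithm round by round: fix a random priority order, and in each round let every element whose priority beats all of its live neighbors commit in parallel. Below is a hedged, sequential sketch of this pattern for maximal independent set (our own toy code, not the papers' implementation); the per-round checks are independent, which is what makes the rounds parallelizable.

```python
import random

def round_based_mis(adj, seed=0):
    """Greedy MIS via rounds of 'local priority minima'.

    adj: dict mapping each vertex to a set of neighbors.
    In each round, a live vertex joins the MIS iff its priority is smaller
    than every live neighbor's; the result equals the sequential greedy MIS
    on the priority order, but the round's decisions are independent.
    """
    rng = random.Random(seed)
    pri = {v: rng.random() for v in adj}
    live = set(adj)
    mis = set()
    while live:
        winners = {v for v in live
                   if all(pri[v] < pri[u] for u in adj[v] if u in live)}
        mis |= winners
        removed = set(winners)
        for v in winners:
            removed |= adj[v] & live      # neighbors of winners drop out
        live -= removed
    return mis
```

With random priorities, the number of rounds is O(log n) with high probability, which is the kind of analysis [42] generalizes to many other greedy algorithms.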

**2 SEQUENCE ALGORITHMS **

Sequence algorithms, i.e., algorithms working on 1D static data such as sorting, are crucial primitives for general algorithm design. You are welcome to read the following papers and discuss some of them in detail. These papers include: comparison sort (sample sort) [4, 13], integer sort [41], and semisort [27, 36]; prefix sums [5]; and random permutation, list ranking, and tree contraction [11, 35, 48].
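As a warm-up for this topic, here is a sequential sketch of the classic work-efficient parallel prefix sum (scan) by pairwise contraction, in the style of [5]: contract adjacent pairs (one parallel step), recurse on the half-size array, then expand (another parallel step). This gives O(n) work; the two list comprehensions are the loops a parallel implementation would run as parallel-for.

```python
def scan_exclusive(a):
    """Exclusive prefix sum: returns (prefix sums, total)."""
    n = len(a)
    if n == 0:
        return [], 0
    if n == 1:
        return [0], a[0]
    pairs = [a[2 * i] + a[2 * i + 1] for i in range(n // 2)]  # contract
    if n % 2:
        pairs.append(a[-1])          # leftover element for odd n
    half, total = scan_exclusive(pairs)
    out = [half[i // 2] if i % 2 == 0 else half[i // 2] + a[i - 1]
           for i in range(n)]        # expand
    return out, total
```

For example, `scan_exclusive([1, 2, 3, 4, 5])` returns `([0, 1, 3, 6, 10], 15)`.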

**3 DATA STRUCTURES **

In class we talk about parallel data structure design, and its advantages over classic concurrent data structures. We use hash tables and binary trees as examples, and we welcome some in-depth discussion and/or analysis of these algorithms. Relevant papers include: hash tables [44]; binary search trees [8, 9, 53], range, interval, and segment trees [51], and persistent trees [52]; priority queues [25]; Cartesian trees [45] and suffix arrays [38]; cover trees [34]; van Emde Boas trees [32]; union-find [50]; and rake-compress trees [1].
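A central idea behind the parallel (and persistent) search trees in this literature is to express all updates through `join` and `split`, copying nodes on the way down so old versions remain valid. Below is a hedged, minimal sketch of this join-based style using a treap (our own toy code, not the papers' implementation); in the real algorithms, the recursive calls in bulk operations are forked in parallel.

```python
import random

class Node:
    __slots__ = ("key", "pri", "left", "right")
    def __init__(self, key, pri, left=None, right=None):
        self.key, self.pri, self.left, self.right = key, pri, left, right

def join(l, m, r):
    """Join treaps with keys(l) < m.key < keys(r); highest priority is root.
    Nodes are copied on the way down (path copying), so inputs stay valid."""
    if (l is None or l.pri < m.pri) and (r is None or r.pri < m.pri):
        return Node(m.key, m.pri, l, r)
    if l is not None and (r is None or l.pri > r.pri):
        return Node(l.key, l.pri, l.left, join(l.right, m, r))
    return Node(r.key, r.pri, join(l, m, r.left), r.right)

def split(t, key):
    """Split t into (keys < key, keys > key); assumes key is absent."""
    if t is None:
        return None, None
    if key < t.key:
        l, r = split(t.left, key)
        return l, Node(t.key, t.pri, r, t.right)
    l, r = split(t.right, key)
    return Node(t.key, t.pri, t.left, l), r

def insert(t, key, rng=random.Random(42)):
    l, r = split(t, key)
    return join(l, Node(key, rng.random()), r)

def inorder(t):
    return [] if t is None else inorder(t.left) + [t.key] + inorder(t.right)
```

Because every update is a pure function from old tree to new tree, persistence (multi-versioning) comes for free, which is the theme of [52].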

**4 GRAPH ANALYTICS **

Graph analytics is a crucial component of algorithm design and parallel algorithm research. Research has spanned programming-level abstractions, specific algorithms, and support for dynamic updates, compression, and more.

Regarding graph-processing systems, there has been extensive study on this topic. Here we only list a small number of them, but you are welcome to read and present works not on this list. Ligra (A Lightweight Graph Processing Framework for Shared Memory [43]) is one of the earliest systems focusing on abstracting and optimizing BFS-like round-based traversal algorithms. Julienne (A Framework for Parallel Graph Algorithms using Work-efficient Bucketing [21]) extended Ligra to efficiently accommodate applications such as shortest paths and *k*-core. GraphChi (Large-Scale Graph Computation on Just a PC [39]) is an early and influential disk-based work, and SAGE [6, 24] focuses on non-volatile main memories (NVRAMs).
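To show what "abstracting round-based traversals" means, here is a hedged sketch of Ligra's edgeMap interface in its sparse (push) direction, with BFS on top (our own toy code; Ligra parallelizes the loop nest and additionally switches to a dense pull-based traversal when frontiers are large).

```python
def edge_map(adj, frontier, update, cond):
    """Apply `update` over edges leaving the frontier; targets that pass
    `cond` and for which `update` succeeds form the next frontier."""
    nxt = set()
    for u in frontier:
        for v in adj[u]:
            if cond(v) and update(u, v):
                nxt.add(v)
    return nxt

def bfs(adj, src):
    """BFS expressed as repeated edgeMap calls; returns a parent map."""
    parent = {src: src}
    frontier = {src}
    while frontier:
        frontier = edge_map(
            adj, frontier,
            update=lambda u, v: parent.setdefault(v, u) == u,  # claim v
            cond=lambda v: v not in parent)                    # unvisited only
    return parent
```

In Ligra itself the `update` is made atomic (e.g., with compare-and-swap) so that concurrent edges racing for the same target resolve to a single parent.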

Parallel graph algorithms are notoriously hard to design, but over the decades, many simple and elegant algorithms have been developed. We list the state-of-the-art algorithms for well-known problems here. Graph connectivity is extensively studied in these two papers [23, 46]. The latest biconnectivity algorithm, FAST-BCC, is given in [26]. On directed graphs, the corresponding problem is called strong connectivity; the latest algorithm, the BGSS algorithm, is given in [11], with an implementation in [55]. The state-of-the-art shortest-path algorithms are given in [25], inspired by earlier work [18, 40]. Other interesting graph algorithms include: *k*-core [21], and maximal independent set, maximal matching, and graph coloring [12, 42].
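For orientation, here is the sequential union-find baseline for connectivity (with path halving). The parallel connectivity algorithms in [23, 46] follow the same hook-and-shortcut structure, but process edges concurrently, hooking roots with atomic compare-and-swap and shortcutting trees between rounds.

```python
def connected_components(n, edges):
    """Label vertices 0..n-1 by component using union-find."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv                 # "hook" one root under the other
    return [find(v) for v in range(n)]
```

Two vertices end up with the same label exactly when they are connected; the concurrent union-find analyzed in [50] bounds the work of this structure under parallel execution.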

Finally, there is a long history of work on dynamic, evolving graphs. Early work includes STINGER [28], among many others. A recent breakthrough is to use dynamic trees to represent the edge lists of the CSR format, which supports efficient updates, multi-versioning, and hybrid updates and queries. This series of work includes the first version, Aspen [22], and the state-of-the-art version, PaC-tree [20]. Both solutions rely on graph compression techniques, and more discussion can be found in [47].
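The multi-versioning idea can be seen in miniature with purely functional updates: inserting an edge produces a new graph version that shares all untouched adjacency sets with the old one. The flat-dict sketch below is our own stand-in for the tree-based sharing in Aspen / PaC-trees, where copying the tree "spine" costs O(log n) instead of the O(n) dict copy here.

```python
def add_edge(graph, u, v):
    """Return a new graph version with edge (u, v) added; the old version
    is untouched, and all other adjacency sets are shared, not copied."""
    g = dict(graph)                          # copy only the top-level index
    g[u] = graph.get(u, frozenset()) | {v}   # fresh set for the one changed vertex
    return g
```

Because old versions stay valid, a long-running query can keep reading one snapshot while updates build new ones, which is exactly the hybrid update/query mode these systems support.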

**5 GEOMETRY PROCESSING **

While graphs capture relationships among objects, geometry directly works with the locations of objects. There have been many recent advances in parallel geometry processing, and we provide a list of references if you are interested in discussing them.

The first type of work is data structures that organize data in low- to mid-dimensional space. Aside from the range, interval, and segment trees [51] and cover trees [34] mentioned above, other work of interest includes bounding volume hierarchies [31, 54] and *k*-d trees [7, 15].
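As a hedged sketch of why these structures parallelize well, consider the median-split *k*-d tree below (our own toy code): the two recursive calls touch disjoint point sets, so a fork-join implementation builds them in parallel, and the papers above additionally parallelize the splitting step itself.

```python
def build_kdtree(pts, depth=0):
    """Build a k-d tree over a list of equal-length point tuples.
    Each node is (point, left subtree, right subtree)."""
    if not pts:
        return None
    axis = depth % len(pts[0])                  # cycle through dimensions
    pts = sorted(pts, key=lambda p: p[axis])    # median split on this axis
    mid = len(pts) // 2
    return (pts[mid],
            build_kdtree(pts[:mid], depth + 1),      # independent: forkable
            build_kdtree(pts[mid + 1:], depth + 1))  # independent: forkable

def size(t):
    return 0 if t is None else 1 + size(t[1]) + size(t[2])
```

The sort at each level is what makes this naive version O(n log² n) work; the parallel constructions in [7, 15] avoid re-sorting to stay work-efficient.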

The second group of work consists of parallel algorithms for classic geometric problems. A recent paper by Blelloch et al. [16] gives parallel algorithms for a long list of problems, such as Delaunay triangulation, low-dimensional LP, and closest pairs [58]. The latest convex hull paper is [17], and the classic divide-and-conquer approach is given in [30].
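To illustrate the divide-and-conquer flavor of this line of work, here is a hedged QuickHull sketch for 2D convex hull (our own toy code, not the algorithm of [17] or [30]): the two recursive calls on each side of the splitting line are independent and fork cleanly, though making such schemes work-efficient requires the more careful techniques in the papers above.

```python
def cross(o, a, b):
    # z-component of (a - o) x (b - o); > 0 when the turn o -> a -> b is CCW
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def quickhull(pts):
    """Return the convex hull vertices of a set of 2D points."""
    pts = list(set(pts))
    if len(pts) <= 2:
        return pts
    a, b = min(pts), max(pts)     # extreme points split the plane

    def rec(p, q, cand):
        left = [r for r in cand if cross(p, q, r) > 0]   # strictly left of p->q
        if not left:
            return []
        far = max(left, key=lambda r: cross(p, q, r))    # farthest point
        # the two subproblems are independent, hence forkable
        return rec(p, far, left) + [far] + rec(far, q, left)

    return [a] + rec(a, b, pts) + [b] + rec(b, a, pts)
```

Interior points are discarded at each level, so on random inputs the recursion behaves like quicksort; worst-case bounds are what [17] improves.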
