Algorithms and Computational Geometry Undergraduate Research

Reprinted from: Journal of Computing in Small Colleges (JSCS) 13:5 (1998) 173--186.

The Undergraduate Algorithms Course and Recent Research in Computational Geometry

Matthew Dickerson

Mathematics and Computer Science

Middlebury College

Middlebury, Vermont 05753

dickerso@midd-unix.middlebury.edu

R.L. Scot Drysdale

Computer Science

Dartmouth College

Hanover, New Hampshire 03755

scot@flume.cs.dartmouth.edu

1. INTRODUCTION

An important topic which has begun to receive attention is that of integrating recent research into the undergraduate curriculum. There are many benefits to such an integration, among the more important of which are: the technology transfer of important results to a community of undergraduate students, many of whom will take their education directly into industry; the opportunity for students to get a clearer sense of recent developments in their field, and what current research in computer science looks like; and the opportunity for faculty (especially at institutions that are not research-oriented) to become familiar with these results.

One prolific research area in computer science in the past two decades, and one in which there has been a tremendous number of interesting results, is computational geometry. Computational geometry is concerned with algorithms, data structures, and geometric properties (including combinatoric and probabilistic) used in the solution to algorithmic problems of a geometric nature. Though the roots of computational geometry can be traced back to Euclid, it is in many ways a relatively young field even by the standards of computer science. It has, however, had a great impact on computer science and specifically on algorithms. One reason for this impact is the large number of algorithmic results and methods coming from the field. A second reason is the number of important applications. Problems of computational geometry are at the root of such different fields as geographic information systems, computer graphics, virtual reality, and medical imaging. It also plays a major role in computer vision and in certain aspects of robotics such as motion planning. Indeed, the list of applications of computational geometry could fill several pages and includes such seemingly unrelated areas as statistical analysis (for which convex hulls have been used) and classification in large databases (which can often be viewed as geometric proximity problems in higher dimensions). Finally, the field has proven to be an excellent area in which to incorporate undergraduate students into active research; the authors of this paper and several others have seen undergraduates achieve considerable success on original research problems in computational geometry. (See, for example, [DPBD98] appearing also in the volume.)

As suggested, bringing some recent research results from computational geometry into the undergraduate classroom can not only bring new energy to a syllabus but can also aid in the technology transfer process as future workers in many of these fields are exposed to advanced techniques and results from recent research. There are many places where such results from computational geometry can be integrated. The most obvious would be a focused course on computational geometry. Unfortunately, though several colleges and universities do offer such a course at the undergraduate level, many smaller departments either do not have the staff for new electives or the enrollments to justify them. And even where such a course is taught, it is usually taught as an elective taken by only a small number of computer science majors. Fortunately, there are several other avenues in which recent research results can be naturally incorporated in the curriculum, including many courses that can benefit greatly from a computational geometry component. For example, undergraduate courses in computer graphics are commonly taught, and provide an excellent and natural opportunity to present such results from computational geometry. Similarly, material from computational geometry will naturally arise in a course on robotics or vision. Even introductory computer science courses (CS1, Data Structures (CS2), and Discrete Mathematics) may benefit from the integration of recent research results, especially in computational geometry.

In this paper we discuss a traditional Undergraduate Algorithms Course such as is part of the core computer science major at most institutions. We discuss how some recent research results from computational geometry fit very naturally into this course, and briefly argue that it is more useful than other more "traditional" material. Specifically, we suggest that the inclusion of this proposed material (in replacement of rather than in addition to other material), while successfully exposing students to real research, will simultaneously enhance the course in other areas as well, making the course more interesting and relevant without putting an additional burden on the instructor to add yet more material to an already full course. The authors of this paper have found several advantages of such an approach to an undergraduate algorithms course. In particular, we will discuss some problems from computational geometry that provide excellent examples of: important advanced data structures; advanced divide-and-conquer techniques; dynamic programming, the greedy approach, and optimization problems; the incremental approach; and approximation algorithms.

2. THE ALGORITHMS COURSE

We begin with a brief outline of the typical algorithms course and where computational geometry fits in. For most computer science majors, The Algorithms Course is taken some time between late in the sophomore year and the senior year, and has as a prerequisite a course on data structures. Topics of the algorithms course typically include many or most of the following:

Some combinatorics and probability (for complexity analysis);
Advanced sorting and searching techniques;
The Divide-and-Conquer approach;
Dynamic Programming;
The Greedy Approach;
Linear Programming;
Amortized Analysis; and
Approximation Algorithms.

There are several textbooks written for this course. One of the best and most popular for an undergraduate algorithms course is Introduction to Algorithms by Cormen, Leiserson, and Rivest (McGraw Hill). Both authors of this paper have used and recommend that book. However, there are literally dozens of other choices which may work better at your institution for one reason or another.

Typical algorithms textbooks draw most of their examples from the area of graph problems. For example, even the text by Cormen et al., which has one entire chapter devoted computational geometry, has five full chapters devoted to graph algorithms. Though there are unquestionably many elegant results in graph theory, especially spanning tree problems, shortest path problems, and network flows, the applications of these algorithms are not as immediately relevant to many students as problems in computational geometry. Furthermore, though it may be argued that these graph algorithms are famous enough and important enough that every computer science major should student them, it is also the case that only a few of these algorithms represent what might be called recent research. On the other hand, the wealth of applications of several problems in computational geometry as well as the relative youth of these results suggest that they might provide better examples for numerous algorithmic topics. Furthermore, the visual nature of geometric problems lends itself very nicely to illustration and animation of algorithms. Finally, it can be argued that robustness is a very important subject in algorithm design, and one that is often ignored; geometric problems (far more than graph problems whose nature tends to be inherently discrete) lend themselves to discussion of robustness issues.

3. A COLLECTION OF TOPICS AND EXAMPLES FROM COMPUTATIONAL GEOMETRY

In this section we discuss a number of geometric algorithms that could be included in an algorithms course. Including all of them would turn the Algorithms course into a Computational Geometry course, which may not be desirable and is certainly not the goal of this paper. However, an instructor could choose a few of the following algorithms to enhance the Algorithms course. Note also that we do not present the algorithms in an extensive manner. The goal here is to present only the flavor of the algorithms and data structures; the details can be found in the references. Additional material is available for the individual instructor once she has decide which algorithms would be most suitable.

3.1 convex hulls

We first present the famous convex hull problem. Informally, the convex hull of a set of points in the plane is what you would get if you drove a nail at each point and then stretched a rubber band around the set. The rubber band would come into contact with the "outermost" points. Figure 1 above shows a set of points and its convex hull. This structure is useful in various clustering and separation problems. For example, two sets of points can be separated by a line if and only if the interiors of their convex hulls do not intersect. Convex hulls also play a similar role in computational geometry to the role that sorting plays in other algorithms: they organize the extremal points of the set into a structure that is ordered, so that they can be sequentially processed or binary searched. For example, algorithms for finding the diameter (furthest pair of points) and width (narrowest pair of parallel lines that contain all the points between them) of a set of points and for finding the furthest points from a given line all start by computing the convex hull of the points.

The algorithms for computing the convex hull vary as much as algorithms for sorting, and can be visually illustrated with nails and boards, rubber bands, and strings. There are incremental algorithms that take time proportional to O(hn) [Ja73], where h is the number of points on the convex hull. There are divide and conquer algorithms that are similar to Mergesort (Preparata and Hong) and to Quicksort (Floyd-Eddy). The former is O(nlog n) while the later is O(n²) worst case but O(n) on the average for uniformly distributed points. The proof of this fact is easy, geometric, and uses the fact that eliminating a constant fraction of a set of points each time (in this case half the remaining points) leads to an O(n) algorithm. Other algorithms use pre-processing to achieve linear expected time under certain assumptions on the distribution of points.

Graham [Gr72] invented a O(n log n) algorithm that starts by sorting the points into polar order about some point, forming a simple polygon. He then walks around this polygon, eliminating points that form concave angles. A simple argument shows that this entire walk takes linear time, and it is one of the easiest examples of an amortized analysis. The algorithm, the proof of its correctness, and the analysis are all instructive. It is also instructive to have students come up with counterexamples that show that this walk around the polygon eliminating concave angles will not work for an arbitrary simple polygon. This leads to interesting discussions about what it means to give a convincing argument that an algorithm is correct.

For these reasons, the first author of this paper spends the first week of the algorithm course on convex hull algorithms, covering three or four famous convex hull algorithms (including Graham's scan, Jarvis' march, and divide-and-conquer), as well as some heuristics for speeding up these algorithms. (See [PS85, Ch.3] and [BKOS97,Chs 1.1 and 11].) Students do some written homework on convex hulls as well as an implementation of one or more algorithms, learning quickly that for even a simple problem there are a wealth of techniques with their own advantages and disadvantages, and also that many algorithmic problems (e.g. diameter and width) can often be solved using other well-known results (convex hulls).

Figure 2

3.2 intersection algorithms

There are a number of problems involving the computation of intersections between geometric objects. Perhaps the most interesting of these for an algorithms course is a sweep-line algorithm to find all intersections between n line segments. The algorithms works in time O((n+I)log n) and O(n) space, where I is the number of intersections reported [BS79,PS91]. (See also [BKOS97,Ch.2].)

The basic idea of a sweep-line algorithm is to turn a 2-dimensional static algorithm into a 1-dimensional dynamic one. The approach is to sweep a vertical line across the set of segments from left to right and to keep track of the order that the segments which intersect this sweep line lie along the line. See Figure 2. As the sweep line moves across the segments, intersections appear (when the line reaches a left endpoint of a segment) and disappear (when the line passes beyond the right endpoint of a segment). The segments at given location of the sweep line are ordered from bottom to top. This order changes precisely when segments intersect.

The sweep-line algorithm is a discrete event simulation. The events are: encountering a left endpoint; encountering a right endpoint; and two segments changing order (intersecting). At first glance it seems that one would have to know all of the intersection points in advance to do this simulation, but it turns out that intersections can be computed "on the fly" as previous events are processed. These ideas (changing a 2-D static problem into a 1-D dynamic one, discrete event simulation, sweep-line, and discovering intersections before the sweepline reaches them) are interesting and powerful. This is also a nice data structures problem involving a balanced binary tree to keep track of the order of segments along the sweep line and a priority queue to keep track of future events.

3.3 range searching

A problem that has obvious applications in graphical user interfaces is the following: given a set of points in the plane and an axis-aligned query rectangle, quickly report which points lie within that rectangle. The problem generalizes to higher dimensions, where the query object is an axis-aligned box. In this form it can be used to express data base queries of the form, "Give me all people in the data base between the ages of 25 and 35 who earn between $30,000 and $50,000 a year".

There is a simple extension of a binary tree called a k-d tree (for k-dimensional tree) that can be used to solve problems of this kind. The idea is to build a binary tree where different levels do key comparisons in different dimensions. These trees compare points on their first coordinate at the root of the tree, on their second coordinate at the second level down, ... through the d^th coordinate at the d^th level of the tree. The process then repeats for the (d+1)^st through the (2d)^th levels, and so on. The second author has successfully assigned implementing k-d trees and used them to solve graphically-presented range queries and closest-point queries in a Data Structures course. It is an application of trees and tree traversals that can be made very concrete via the graphical user interface.

Other data structures for solving range searching problems include quad trees (2-dimensional) and oct trees (3-dimensional), where instead of splitting along one dimension we split along two dimensions or three dimensions at the same time, leading to 4-way or 8-way branching. These structures are very useful in computer graphics and in volume representations as well. For those who wish to explore the subject more deeply, there are recursively defined range trees that achieve very good worst-case time bounds. Two books by H. Samet [Sa90a,Sa90b] explore these topics in greater depth. (See also [BKOS97, Ch.5].)

3.4 planar point location

Another question that arises in graphical user interfaces is the problem of deciding which window or button a mouse click occurs in. This is a special case of a more general problem: given a point and a subdivision of the plane into regions, decide which region contains the point. (The regions are usually polygonal or bounded by fairly simple curves). A more complex example occurs in geographic information systems, where a map is divided into political subdivisions (nations, states, counties, etc.) and the job is to decide in which region a given point lies. (See [BKOS97, Ch.6].)

The naive algorithm requires O(n) time to solve this problem, where n is the total number of boundary segments of all regions. However, it is possible to preprocess the subdivision in a way that allows us do much better than this. In fact, several approaches exist for solving this problem in O(log n) search time and O(n) space.

One approach, by Kirkpatrick [Ki83], works for polygonal subdivisions. He first triangulates the subdivision. He then creates a sequence of subdivisions, each containing a constant fraction fewer vertices than the previous one, ending up with a subdivision consisting of a single triangle. He does this by finding a set of low-degree vertices (no two of which are adjacent), eliminating them from the triangulation, and re-triangulating the "holes" that are left. (This set of vertices is chosen using a greedy algorithm.) Viewing these subdivisions in reverse order, he starts with a very coarse subdivision and refines it to finer and finer subdivisions at each level. He first locates a query point in the coarsest subdivision. In constant time he can then locate it in the next finer subdivision. Knowing the region that the point lies in this subdivision allows him to locate it in constant time in the next finer subdivision, and so on until he locates it in the original subdivision. The time required is O(# levels of subdivision), which is O(log n).

3.5 closest-point problems and the Voronoi diagram

A famous problem is the closest-point problem: given a set of points, find which pair is closest together. This problem can be solved in any dimension using a divide-and-conquer approach. It provides an excellent illustration of this approach, in part because a naive merge step yields an O(n²) algorithm. The "trick" to achieving an optimal O(n log n) algorithm requires both a geometric observation of a "sparsity" condition and also a sorting on both x and y coordinates (in two-dimensions) [BS76].

There is, however, a very different approach to the closest-point problems which has the advantage of introducing a new geometric construct that not only directly solves this problem but dozens of other problems as well. The Voronoi diagram of a set of n points in the plane (called "sites") is a subdivision of the plane into n regions, one for each site. See Figure 3a. Each site's region consists of all points in the plane closer to that site than to any of the other sites.

This data structure has been rediscovered dozens of times in fields including crystallography (modeling crystals), metallurgy (modeling "grain growth" in thin metal films), biology (modeling territories of animals and growth of colonies of clover), meteorology (estimating regional rainfall), geography (defining "market regions" for towns), and archeology (defining regions controlled by hill forts). In computer science it has been used to solve a wide array of problems. The first was Knuth's "post-office" problem: given the coordinates to which a letter is to be delivered, find the nearest post-office to the destination. If you know in which region of the Voronoi diagram the point lies you also know the nearest post-office. It is the site associated with the region. Other problems efficiently solves with this structure include finding the nearest site to each of the other sites, finding the minimum spanning tree of the sites, and finding the point within a boundary polygon that is farthest from any of the given sites.

The Voronoi diagram can be computed in O(n log n) time by a divide-and-conquer algorithm. Incremental algorithms with bucketing also work well in practice for uniformly distributed sites. There are a number of other algorithms known, including a sweep-line algorithm and an elegant method that computes the 2-dimensional Voronoi diagram by computing a 3-dimensional convex hull. Aurenhammer [Au91] wrote an excellent survey of Voronoi diagram algorithms, applications, and extensions and even more extensive treatment can be found in a book by Okabi, Boots, and Sugihara [OBS92]. (See also [PS85,Chs 5,6] and [BKOS97,Ch.7].)

3.6 optimization problems (triangulations, dynamic programming, and greedy algorithms)

A triangulation of a point set is a maximum straight-line planar graph with the points as vertices. Informally, if you take a set of points and continue to draw non-intersecting straight edges between pairs of points until no more edges can be drawn without intersections, you have a triangulation. See Figure 3b above. Note that a triangulation partitions the plane into a set of triangular regions inside the convex hull of the point set (hence the name "triangulation") plus one unbounded region containing everything outside the convex hull. Similarly, a simple polygon of n vertices can be triangulated by added n-3 non-intersecting internal diagonals. See Figure 4.

One of the most important applications of triangulations are as meshes in computer-aided modeling. They are also interesting to study because of their interesting combinatorial and geometric properties. A polygon as well as a set of points both have a very large number of different triangulations. In fact, triangulations of a convex polygon have (surprisingly) a very similar combinatorial structure to the number of binary trees. For example the number of triangulations of an n-vertex convex polygons is the (n-2)^nd Catalan number, which is also the number of binary trees of n-2 vertices, and the number of ways of associating n-2 matrices for multiplication.

Since not all triangulations are equal (just as not all binary trees are equal), considerable energy has been devoted to the study of various types of triangulations. One can give several different criteria by which a triangulation could be measured, such as the maximum angle, the minimum angle, the total length of the edges, etc. These leads to the goal of optimization problems, as for any of these criteria one can seek to compute the optimum triangulation.

One interesting triangulation is the Minimum Weight Triangulation, defined as the triangulation whose total edge length is smallest of all possible triangulations. For a simple polygon, the minimum weight triangulation can be computed in O(n³) time using dynamic programming. The technique is very simple, and provides an excellent example of the dynamic programming technique [Gi79]. The basic observation is that any triangle on a polygon partitions the polygon into two sub-polygons, each of which must be triangulated optimally for the overall triangulation to be optimal. Since each convex hull edge is part of exactly one triangle, we simply have to choose which other vertex is the third vertex of that triangle. There are n-2 possible choices, and each partitions the problem into two disjoint subproblems.

Interestingly, the minimum weight triangulation problem is not yet solved for general point sets. In fact, it is still open whether the problem is in P or is NP-hard. Dynamic programming will not work efficiently to solve the problem, since a triangle no longer partitions the set into two disjoint subsets. This suggest alternate approaches either for special cases in which the minimum triangulation can be computed exactly [DKM97] or heuristics for approximating the minimum weight triangulation [DMM95]. One such approach is the so-called greedy approach [DDMW97]. The greedy approach tests possible edges one at a time in non-decreasing order by length, adding those edges which intersect none of the previously added edges until no more edges can be added. This is a nice example of how a greedy approach can be implement efficiently, but does not produce an exact answer to the optimization problem. There have been several papers exploring the implementation of the greedy triangulation including a 1979 paper by Gilbert [Gi79] giving an O(n²logn) time and O(n²) space algorithm for the greedy triangulation of a point set to a much more recent O(n log n) time and O(n) space approach by Wang [Wa94]. Numerous other papers have explored both theoretically and empirically how good an approximation the greedy triangulation is of the exact minimum weight triangulation.

4. CONCLUSIONS AND OTHER RESOURCES

We have suggested several topics from computational geometry that can be incorporated into a traditional algorithms course. For the two authors of this paper, the use of computation geometry has proven very effective. Students have found the material interesting, especially when motivated with applications. They have also enjoyed the visual nature of the material, and have been successful implementing several algorithms. We have been especially pleased with the material on convex hulls, on line sweeping to find segment intersections, and on the closest pair problems, while for the second author the k-d tree examples has been a good algorithm for student implementation. It is worth noting that the [CLR90] algorithms text chapter on Computational Geometry discusses two of these three problems. Another area of success has been interesting students in original research. Readers are referred to [DPBD98] for further details and to [DMM95,DKM97] for examples.

Again we are not suggesting that all of these algorithms or problems be discussed in a single course, as that could easily fill the entire course. Rather, these are just some example topics that might strengthen such a course. One might choose some subset of these or other examples from computational geometry for the algorithms course, while teaching the remainder of the course using a traditional approach (usually based on graph algorithms).

We acknowledge that for somebody outside the field of computational geometry, it may seem a daunting task to modify an existing syllabus, especially to add new material that is not found in the textbook and that is outside oneís own area. However we believe that there are numerous benefits to the incorporation of this material, and that it will make the course stronger. There are several resources that may be of use, including many excellent textbooks on computational geometry with material appropriate for the undergraduate level, as well as some useful on-line resources. Below is a list of some of the available books and websites. The bibliography also contains some selected papers.

4.1 textbooks in computational geometry

The following are textbooks in computational geometry, with material relevant to an undergraduate audience. In all of the books, there is self-contained material which could be used in an algorithms course. In particular, they provide additional detail on many of the subjects described in this paper.

[BKOS97] M. de Berg, M. van Kreveld, M.Overmars, and O.Scwarzkopf, Computational Geometry: algorithmics and applications (Springer-Verlag, 1997).

[OíRo94] J. OíRourke, Computational Geometry in C (Cambridge, 1994).

[PS85] F. Preparata and M. Shamos, Computational Geometry: an introduction (Springer-Verlag, 1985).

[OBS92] A. Okabi, B. Boots, and K. Sugihara, Spatial Tesselations: Concepts and Applications of Voronoi Diagrams (Wiley & Sons, 1992).

Many of the algorithms discussed in Section 3 are explained in the previous mentioned textbooks on computational geometry (See Section 4.1). For most people teaching an algorithms course, we expect these introductory computational geometry texts to be the most useful source for further information on these algorithms, including the appropriate level of detail for the instructor of an undergraduate algorithms course. For the interested reader, however, we include at the end of the paper a short bibliography with some of the research papers referenced in this work. This is far from extensive, presenting only a small sampling of original solutions to the problems described above.

4.2 useful websites

In addition to these texts, there are several highly useful on-line resources in computational geometry. On these pages you can find on-line animations of important ideas in computational geometry as well as down-loadable software. Additionally, there are many course home-pages with syllabi and lecture notes not only for classes specifically in computational geometry but for classes in other areas (graphics, geographic information systems, and algorithms) that make use of algorithms from geometry. Here are some of the more general pages from which one can find links to others.

Nina Amenta's Directory of CG Software provides links to dozens of sites where one can download code implementing many of the important algorithms in computational geometry. Found at http://www.geom.umn.edu/software/cglist/

Jeff Erickson's Computational Geometry Page found at: http://www.cs.duke.edu/~jeffe/compgeom/ and David Eppstein's Geometry in Action page found at: http://www.ics.uci.edu/~eppstein/geom.html both provide links to dozens of resources in computational geometry, including to the home pages of geometers, interactive geometry sites, and to other computational geometry pages.

Of particular note is Erickson's page on Computational Geometry Course Materials at: http://www.cs.duke.edu/~jeffe/compgeom/courses.html

REFERENCES

[Au91] F.Aurenhammer, "Voronoi diagrams: a survey of a fundamental geometric data structure." ACM Computing Surveys 23 (1991) 345-405.

[BO79] J.L.Bentley and T.A.Ottmann, "Algorithms for reporting and counting geometric intersections." IEEE Transactions on Computing 28 (1979) 643--647.

[BS76] J.L.Bentley and M.I.Shamos, "Divide and conquer in multi-dimensional space." Proceedings of the 8th Annual Symposium on Theory of Computing (1976) 220--230.

[CLR90] T.Cormen, C.Leiserson, and R.Rivest, Introduction to Algortihms. (McGraw Hill 1990).

[DDMW97] M.Dickerson, R.L.Drysdale, S.McElfresh, and E.Welzl, "Fast greedy triangulation algorithms." Computational Geometry: theory and applications 8 (1997) 67-86.

[DKM97] M.Dickerson, J.M.Keil, and M.Montague, "A large subgraph of the minimum weight triangulation." Discrete and Computational Geometry 18 (1997) 289-305.

[DMM95] M.Dickerson, M.Montague, and S.McElfresh, "New algorithms and empirical findings on minimum weight triangulation heuristics." Proceedings of the 11th Annual Symposium on Theory of Computing (1995) 238--247.

[DPBD98] C.Dima, G.Parent, A.Briggs, and M.Dickerson, "An Undergraduate Research Project in Computational Geometry", Proceedings of the 3rd Northestern Conference for the Consortium for Computing at Small Colleges (CCSCNEí98).

[Gi79] P.Gilbert, "New results in planar triangulations," MS Thesis, University of Illinois, Urbana, IL (1979).

[Gr72] R.L.Graham, "An efficient algorithm for determining the convex hull of a finite planar set." Information Processing Letters 1 (1972) 132--133.

[Ja73] R.A.Jarvis, "On the identification of the convex hull of a finite set of points in the plane." Information Processing Letters 2 (1973) 18--21.

[Ki83] D.G.Kirkpatrick, "Optimal search in planar subdivisions." SIAM Journal of Computing 12 (1983) 28--35.

[PS91] J.Pach and M.Sharir, "On vertical visibility in arrangements of segments and the queue size in the Bentley-Ottman line sweeping algorithms." SIAM Journal of Computing 20 (1991) 460--470.

[Sa90a] H.Samet, Applications of Spatial Data Structures: Computer Graphics, Image Processing, and GIS. Addison-Wesley, 1990.

[Sa90b] H.Samet, The Design and Analysis of Spatial Data Structures, Addison-Wesley, 1990.

[Wa94] C.A.Wang, "An optimal algorithm for greedy triangulation of a set of points." Proceedings of the 6h Canadian Conference on Computational Geometry (1994) 332-338.