I was thinking of building an interactive visualization of mountain prominence, by progressing down the contour lines till the current contour line encircles a peak that is taller than the one that I started with.
I think Quadtrees will come handy in this visualization, if a precomputed list of all the peaks were available.
Nice and concise description of quadtrees implemented with classical pointer chasing data structures.
A faster and (arguably) simpler way to construct quad/octrees is using morton codes, sorting and searching with flat arrays [0]. It's also probably easier to implement.
The gist of it is that you quantize the coordinates, do bit interleaving to construct morton code and then sort. The sorting is using numerical keys so you can use a radix sort for O(n) complexity which is much faster on a GPU (but on single CPU core a comparison based sort will probably win if the array isn't huge).
Now everything on the top half of the space is in the beginning of the array and the bottom half is in the end of the array (assuming yxyxyxyx morton code bits). Each of those halves is then split left/right and then each level below that alternates between vertical and horizontal splits.
To find the split point in the array, look at the first and last entry of the (sub)array you're looking at, and look for the first differing bit with (first ^ last).leading_zeros(). Then binary search for the first entry where that bit is high.
To traverse the quad/octree, repeat this process with the two halves you found. This can be done without recursion in O(1) memory using a fixed size stack because you know the depth of the "recursion" is at most half the number of bits in the morton code.
If you used radix sorting for building the array, you can avoid the binary search if you store the histograms from the counting phase of the sorting. Storing the whole histogram may be too much, but just a few highest order bits can already help.
Although I've found experimentally that for small (less that 10000 objects) just sorting and searching is faster if the whole array fits in L2 cache. On my 2015 laptop a single core can sort 10k objects with 64 bit keys in 1 millisecond. Traversing the tree for frustum culling is about 5x faster than just going through the entire array because a lot of geometry can be discarded very quickly.
With a good comparison based sort rebuilding the array after some changes is mighty fast because modern sorting algorithms are much faster for "almost sorted" inputs.
For range queries ("Find everything in the region" in the article), you can probably get better performance by using the BIGMIN/LITMAX method [1].
Now here's a brain teaser to delight your Friday: the article and the method I describe above is for storing points/centroids, but often you need to store axis aligned boxes instead (AABB). There is a very clever trick for this that I discovered independently but later found in some research papers. Can you come up with a (very) small change in the algorithm above to extend it to AABBs instead of centroids?
I was thinking of building an interactive visualization of mountain prominence, by progressing down the contour lines till the current contour line encircles a peak that is taller than the one that I started with.
I think Quadtrees will come handy in this visualization, if a precomputed list of all the peaks were available.
A faster and (arguably) simpler way to construct quad/octrees is using morton codes, sorting and searching with flat arrays [0]. It's also probably easier to implement.
The gist of it is that you quantize the coordinates, do bit interleaving to construct morton code and then sort. The sorting is using numerical keys so you can use a radix sort for O(n) complexity which is much faster on a GPU (but on single CPU core a comparison based sort will probably win if the array isn't huge).
Now everything on the top half of the space is in the beginning of the array and the bottom half is in the end of the array (assuming yxyxyxyx morton code bits). Each of those halves is then split left/right and then each level below that alternates between vertical and horizontal splits.
To find the split point in the array, look at the first and last entry of the (sub)array you're looking at, and look for the first differing bit with (first ^ last).leading_zeros(). Then binary search for the first entry where that bit is high.
To traverse the quad/octree, repeat this process with the two halves you found. This can be done without recursion in O(1) memory using a fixed size stack because you know the depth of the "recursion" is at most half the number of bits in the morton code.
If you used radix sorting for building the array, you can avoid the binary search if you store the histograms from the counting phase of the sorting. Storing the whole histogram may be too much, but just a few highest order bits can already help.
Although I've found experimentally that for small (less that 10000 objects) just sorting and searching is faster if the whole array fits in L2 cache. On my 2015 laptop a single core can sort 10k objects with 64 bit keys in 1 millisecond. Traversing the tree for frustum culling is about 5x faster than just going through the entire array because a lot of geometry can be discarded very quickly.
With a good comparison based sort rebuilding the array after some changes is mighty fast because modern sorting algorithms are much faster for "almost sorted" inputs.
For range queries ("Find everything in the region" in the article), you can probably get better performance by using the BIGMIN/LITMAX method [1].
Now here's a brain teaser to delight your Friday: the article and the method I describe above is for storing points/centroids, but often you need to store axis aligned boxes instead (AABB). There is a very clever trick for this that I discovered independently but later found in some research papers. Can you come up with a (very) small change in the algorithm above to extend it to AABBs instead of centroids?
[0] Karras: Maximizing Parallelism in the Construction of BVHs, Octrees, and k-d Trees - https://research.nvidia.com/sites/default/files/pubs/2012-06... [1] https://en.wikipedia.org/wiki/Z-order_curve#Use_with_one-dim...