A national mapping agency now produces between 30 and 100 terabytes of LiDAR every year, and the format this annual harvest is written in decides whether the data will still be queryable in five years or stranded behind an obsolete tool chain. The decision sounds bureaucratic until one tries to reach into a 50 GB ALS tile to compute a single statistic and discovers that the query takes hours without a spatial index: every algorithm that touches a point's neighbourhood is then quadratic in the worst case, and on a hundred million points that means on the order of 10^16 distance computations. A multi-station TLS campaign tells the same story from the terrestrial side: a dozen stations arrive in the scanner-native E57 format, are converted to LAS for the downstream tool chain, and in that conversion a quiet decision is made about which attributes survive the round trip and which silently disappear. I have watched point cloud projects fail at this stage more often than at any deep methodological step that follows, and the failure usually surfaces weeks later, when a clean experimental result proves unreproducible because the data was rewritten through a lossy converter no one logged. We open by laying out the major formats (LAS, LAZ, COPC, E57, PLY, PCD, EPT) and then turn to the indexing structures (kd-tree, octree, R-tree, grid, ball tree) that make billion-point queries feasible.
These two themes, storage and retrieval, structure the chapter, in that order, because a format is the thing one receives and an index is the thing one builds on top of it.
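To make the quadratic cost concrete before the formats are introduced, here is a minimal sketch in plain Python. It counts the distance computations a brute-force neighbourhood pass needs when every point scans all the others, and then builds the simplest of the indexes listed above, a uniform grid, to answer a radius query by visiting only nearby cells. The function names, the 2D simplification, and the cell size are my own assumptions for illustration. They are not taken from any particular library, and a production pipeline would use a tested kd-tree or octree implementation instead.

```python
import math
from collections import defaultdict


def brute_force_distance_count(n):
    """Distance computations when each of n points scans all n - 1 others."""
    return n * (n - 1)


def build_grid(points, cell):
    """Bucket the index of each 2D point into a square cell of side `cell`."""
    grid = defaultdict(list)
    for i, (x, y) in enumerate(points):
        grid[(math.floor(x / cell), math.floor(y / cell))].append(i)
    return grid


def radius_query(points, grid, cell, q, r):
    """Indices of points within distance r of q, checking only nearby cells."""
    qx, qy = q
    reach = math.ceil(r / cell)  # how many cells the radius can span
    cx, cy = math.floor(qx / cell), math.floor(qy / cell)
    hits = []
    for gx in range(cx - reach, cx + reach + 1):
        for gy in range(cy - reach, cy + reach + 1):
            for i in grid.get((gx, gy), ()):
                px, py = points[i]
                if (px - qx) ** 2 + (py - qy) ** 2 <= r * r:
                    hits.append(i)
    return sorted(hits)


def radius_query_naive(points, q, r):
    """Reference answer: test every point, with no index at all."""
    qx, qy = q
    return [i for i, (px, py) in enumerate(points)
            if (px - qx) ** 2 + (py - qy) ** 2 <= r * r]


if __name__ == "__main__":
    # A hundred million points, each scanning all others:
    print(brute_force_distance_count(10**8))  # 9999999900000000, about 1e16
```

The grid query touches only the cells that the search radius can overlap, so its cost follows local point density rather than the size of the whole tile. The naive function is kept alongside it as a reference, so the indexed result can be checked against a full scan on a small sample.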