Preparing raw LiDAR for efficient (online) distribution

On August 14th, Prof. David Pyle tweeted about raw LiDAR being publicly available for the volcanic island Nea Kameni and its little sibling Palea Kameni, which are part of the Santorini caldera in Greece. A big “kudos” to all those like Prof. Pyle who share raw LiDAR data online for all to download – be it for transparent research, as an open data policy, or to enable innovation. Although fancy download portals like OpenTopography are great, even a folder full of files accessible via simple FTP or HTTP is an incredible resource. For the latter, here are some tips from rapidlasso on how to prepare your raw LiDAR flightlines with LAStools so that they are as good as they can be for those who download them … (-:

The twelve LiDAR flight lines for the Kameni islands are a perfectly sized example of how to prepare raw LiDAR for efficient (online) distribution. They are provided both in the LAS format and as simple ASCII files. Stored in the LAS format, the twelve strips (1,2,3,4,5,6,7,8,9,10,11,12) are about 1 Gigabyte:

lidar_for_download_santorini_size_raw

A lasinfo report tells us the following:
(1) The exporting software used the wrong GeoTIFF tag to specify the UTM 35 (north) projection via EPSG code 32635.
(2) There are four proprietary “LeicaGeo” VLRs totalling 22000 bytes stored in each header. Does someone know what they contain and how to read them?
(3) There are two legacy bytes following the header that are part of the (now somewhat dated) LAS 1.0 specification.
(4) The coordinates are stored with millimeter resolution (i.e. the scale factors are 0.001). This is overkill for airborne LiDAR. Those millimeters are just scanning noise, are misleading, and negatively affect compression.
(5) The file does not store flight line numbers in the “file source ID” field.

lidar_for_download_santorini_lasinfo_raw
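
For those who want to check a few of these header fields without LAStools, here is a minimal Python sketch that unpacks them directly from the public header block. The byte offsets follow the LAS 1.0 to 1.2 header layout (227 bytes). The names `read_las_header` and `build_demo_header` are made up for this illustration and are not part of any LAStools API, and the demo header is synthetic rather than taken from the Kameni files.

```python
import struct

LAS_HEADER_SIZE = 227  # size of the public header block in LAS 1.0 to 1.2


def read_las_header(data: bytes) -> dict:
    """Unpack a few fields of a LAS 1.0-1.2 public header block."""
    if len(data) < LAS_HEADER_SIZE or data[:4] != b"LASF":
        raise ValueError("not a LAS 1.0-1.2 header")
    (file_source_id,) = struct.unpack_from("<H", data, 4)
    version = struct.unpack_from("<BB", data, 24)         # (major, minor)
    (header_size,) = struct.unpack_from("<H", data, 94)
    offset_to_points, num_vlrs = struct.unpack_from("<II", data, 96)
    scale = struct.unpack_from("<3d", data, 131)          # x, y, z scale factors
    return {
        "file_source_id": file_source_id,
        "version": version,
        "header_size": header_size,
        "offset_to_points": offset_to_points,
        "num_vlrs": num_vlrs,
        "scale": scale,
    }


def build_demo_header() -> bytes:
    """Synthetic LAS 1.0 header: millimeter scale, no flight line number."""
    header = bytearray(LAS_HEADER_SIZE)
    header[0:4] = b"LASF"
    struct.pack_into("<H", header, 4, 0)             # file source ID not set
    struct.pack_into("<BB", header, 24, 1, 0)        # version 1.0
    struct.pack_into("<H", header, 94, LAS_HEADER_SIZE)
    struct.pack_into("<II", header, 96, LAS_HEADER_SIZE + 2, 0)
    struct.pack_into("<3d", header, 131, 0.001, 0.001, 0.001)
    return bytes(header)


info = read_las_header(build_demo_header())
print(info)
```

For a real file you would pass the first 227 bytes read in binary mode, for example `read_las_header(open("strip.las", "rb").read(227))`, where `strip.las` is a placeholder file name.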

Visualizing the twelve flight lines with lasview further illustrates:
(1) A cruise ship and a cloud were captured. The latter is classified as noise (7).
(2) An intensity value is stored for each return.
(3) Almost all laser shots resulted in a single return, which is not surprising for a volcanic island without vegetation. The multi-returns from the cloud are a colorful exception.
(4) There are some strange (=> useless) numbers stored in the “point source ID” field of each point that should really store the flight line number of each point.

We suggest fixing up this raw LiDAR as follows to make it more efficient and useful for those receiving it. We run las2las to do the following:
(1) Remove the legacy two bytes following the header.
(2) Switch to the 1.2 version of the LAS format.
(3) Remove all existing VLRs.
(4) Set the horizontal projection to EPSG code 32635 and the vertical datum to WGS84.
(5) Set the file source ID of each file and the point source ID of each point to the same value. This value starts at one and is incremented with each file.
(6) Rescale coordinate resolution to centimeters.
(7) Append the string ‘_cm’ to the original file name to output LASzip-compressed files that end in ‘*.laz’.

las2las -i *.LAS ^
        -remove_extra ^
        -set_version 1.2 ^
        -remove_all_vlrs ^
        -epsg 32635 -vertical_wgs84 ^
        -files_are_flightlines ^
        -rescale 0.01 0.01 0.01 ^
        -odix _cm -olaz
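
To illustrate what rescaling from millimeters to centimeters does to the stored values, here is a small Python sketch. LAS stores each coordinate as an integer that is multiplied by the scale factor (and shifted by an offset) to give the real coordinate. The function names are made up for this illustration, the offset is assumed to stay unchanged, and the rounding shown is only the general idea, not necessarily how las2las implements '-rescale'.

```python
def rescale_coordinate(raw: int, old_scale: float, new_scale: float) -> int:
    """Re-quantize a stored integer coordinate to a coarser scale factor."""
    return round(raw * old_scale / new_scale)


def to_meters(raw: int, scale: float, offset: float = 0.0) -> float:
    """Turn a stored integer back into a real-world coordinate."""
    return raw * scale + offset


# three made-up integer coordinates stored with scale factor 0.001
raw_mm = [123456, 123454, 987001]
raw_cm = [rescale_coordinate(v, 0.001, 0.01) for v in raw_mm]
print(raw_cm)  # [12346, 12345, 98700]

# the change per coordinate never exceeds half a centimeter
errors = [abs(to_meters(a, 0.001) - to_meters(b, 0.01))
          for a, b in zip(raw_mm, raw_cm)]
print(max(errors))
```

The integers get one digit shorter and the noisy last digit disappears, which is what helps the compressor.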

The las2las call results in much smaller files that are easier to host and faster to download. We achieve a compression factor of almost 9, as the new files are only 11.2 % of the size of the original files.

lidar_for_download_santorini_size_fixed

We also get a much cleaner raw LiDAR file that has only meaningful VLRs, more suitable coordinate resolution, and properly populated flight line information:

lidar_for_download_santorini_lasinfo_fixed

And we now have flightline information that tells us, for example, that the (blue) cloud was captured by the third flight line.

lidar_for_download_santorini_05

We noticed something odd when loading the files with lasview: Although most flightlines were still in acquisition order (i.e. the points in the file are in the order acquired by the scanner), flightlines 8 and 10 were not. Perfectionists like us employ lassort with option ‘-gps_time’ and reorder all twelve files by GPS time on 4 cores in parallel.

lassort -i LDR*cm.laz ^
        -gps_time ^
        -odix _sort -olaz ^
        -cores 4
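
What “acquisition order” means can be sketched in a few lines of Python: a strip is in acquisition order when the GPS time stamps never decrease from one point to the next. The point records here are plain dictionaries with a hypothetical `gps_time` key and made-up values. This is only an illustration of the check and the reordering, not how lassort is implemented.

```python
def is_acquisition_order(gps_times) -> bool:
    """True if the GPS time stamps never decrease along the file."""
    return all(a <= b for a, b in zip(gps_times, gps_times[1:]))


def sort_by_gps_time(points):
    """Return the points reordered by GPS time (stable for equal stamps)."""
    return sorted(points, key=lambda p: p["gps_time"])


# made-up strip whose second and third point are swapped
strip = [
    {"gps_time": 100.0, "z": 12.1},
    {"gps_time": 100.2, "z": 12.4},
    {"gps_time": 100.1, "z": 12.3},
    {"gps_time": 100.3, "z": 12.2},
]

print(is_acquisition_order([p["gps_time"] for p in strip]))   # False
fixed = sort_by_gps_time(strip)
print(is_acquisition_order([p["gps_time"] for p in fixed]))   # True
```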

Several calls to lasdiff – a tool that reports any content and order difference between two files – confirm that only points of flightlines 8 and 10 are actually reordered by the lassort call. Here are two example calls:

lidar_for_download_santorini_lasdiff_sorted

One – rather new – thing that we now recommend is to also add spatial indexing information by storing the little LAX files directly inside the compressed LAZ files. This can be done in-place with

lasindex -i LDR*cm.laz ^
         -append ^
         -cores 8

Such spatial indexing information (see the video of our ELMF 2012 talk for more details) allows faster spatial queries that only read and decompress the actually queried area-of-interest. This is already used by all LAStools and can be exploited via the LASlib application programming interface. Soon there will also be support for spatially-indexed queries in the LASzip compression DLL.
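
The idea behind such an index can be sketched in Python: remember, for each cell of a coarse partition of the x/y plane, which intervals of point indices fall into it, so that an area-of-interest query only needs to touch those intervals. The sketch below uses a simple uniform grid and made-up points. It is an assumption-laden illustration of the principle, not the LAX file format or the LASlib API.

```python
from collections import defaultdict


def build_index(points, cell_size):
    """Map each grid cell to intervals [first, last] of consecutive point indices."""
    cells = defaultdict(list)
    for i, (x, y) in enumerate(points):
        key = (int(x // cell_size), int(y // cell_size))
        intervals = cells[key]
        if intervals and intervals[-1][1] == i - 1:
            intervals[-1][1] = i          # extend the current run of points
        else:
            intervals.append([i, i])      # start a new run
    return dict(cells)


def query(index, cell_size, min_x, min_y, max_x, max_y):
    """Return the sorted point-index intervals of all cells touching the box."""
    hits = []
    for cx in range(int(min_x // cell_size), int(max_x // cell_size) + 1):
        for cy in range(int(min_y // cell_size), int(max_y // cell_size) + 1):
            hits.extend(tuple(iv) for iv in index.get((cx, cy), []))
    return sorted(hits)


# made-up x/y positions in file order
points = [(1, 1), (2, 1), (12, 1), (13, 2), (3, 3), (25, 25)]
index = build_index(points, cell_size=10)

# only points 0-1 and 4 need to be read for this area-of-interest
print(query(index, 10, 0, 0, 9, 9))  # [(0, 1), (4, 4)]
```

The fewer and longer the intervals per cell, the less a reader has to seek and decompress, which is one reason why points that are stored in a spatially coherent order index well.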

lidar_for_download_santorini_06

Finally, should you have ortho photo imagery in TIF format (as was kindly provided here by Prof. David Pyle), then you can embellish your LiDAR points with RGB using lascolor as shown here:

lascolor -i LDR*_cm.laz ^
         -image KameniOrtho.tif ^
         -odix _col -olaz ^
         -cores 4
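
Conceptually, coloring a point means looking up the pixel of the ortho photo that covers the point's x/y position. Here is a minimal Python sketch of that lookup for a north-up raster given as nested lists. The function name, the tiny 2 x 2 image, the georeferencing values, and the choice to leave points outside the image black are all assumptions made for this illustration, not a description of what lascolor does internally.

```python
def color_points(points, image, origin_x, origin_y, pixel_size):
    """Attach the RGB of the covering pixel to each (x, y, z) point.

    origin_x / origin_y is the upper-left corner of a north-up raster,
    so rows grow towards the south and columns towards the east.
    """
    rows, cols = len(image), len(image[0])
    colored = []
    for x, y, z in points:
        col = int((x - origin_x) // pixel_size)
        row = int((origin_y - y) // pixel_size)
        if 0 <= row < rows and 0 <= col < cols:
            rgb = image[row][col]
        else:
            rgb = (0, 0, 0)   # assumption: points outside the image stay black
        colored.append((x, y, z, *rgb))
    return colored


# made-up 2 x 2 image with 1 m pixels and upper-left corner at (100, 200)
image = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 0)],
]
points = [(100.5, 199.5, 10.0), (101.5, 198.5, 5.0), (50.0, 50.0, 0.0)]

for record in color_points(points, image, 100.0, 200.0, 1.0):
    print(record)
```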

Now you have raw LiDAR strips for online distribution that are as good as they get … (-:

lidar_for_download_santorini_07

2 thoughts on “Preparing raw LiDAR for efficient (online) distribution”

  1. Hi Martin, thanks for the note. Is there an easy (using LAStools of course) way to tell whether adjacent flight lines flew opposing directions or not? Btw, I used the DQM tool we developed and analyzed the data and found residual roll errors

    • Loading all twelve flightlines into lasview will make this obvious due to the order in which the points are drawn. To do this just run “lasview -i LDR*.las”. Of course, the flightlines need to be GPS-time sorted and most of them are (except strips 8 and 10).
