Plots to Stands: Producing LiDAR Vegetation Metrics for Imputation Calculations

Many remote sensing professionals use LAStools to extract statistical metrics from LiDAR that are then used to make estimations about a larger area of land from a small set of sample plots. Common applications are the prediction of timber volume or above-ground biomass for entire forests based on a number of representative plots where exact measurements were obtained with field work. The same technique can also be used to make estimations about animal habitat or coconut yield or to classify the type of vegetation that covers the land. In this tutorial we describe the typical workflow for computing common metrics for smaller plots and larger areas using LAStools.

Download these six LiDAR tiles (1, 2, 3, 4, 5, 6) from a Eucalyptus plantation in Brazil to follow along with the step-by-step instructions of this tutorial. This data is courtesy of Suzano Pulp and Paper. Please also download the two shapefiles that delineate the plots where field measurements were taken and the stands for which predictions are to be made. You should download version 170327 (or higher) of LAStools due to some recent bug fixes.

Quality Checking

Before processing newly received LiDAR data we always perform a quality check first. This ranges from visual inspection with lasview, to printing textual content reports and attribute histograms with lasinfo, to flight-line alignment checks with lasoverlap, pulse density and pulse spacing checks with lasgrid and las2dem, and a completeness-of-returns check with lassort followed by lasreturn.

lasinfo -i tiles_raw\CODL0003-C0006.laz ^
        -odir quality -odix _info -otxt

The lasinfo report tells us that there is no projection information. However, we remember that this Brazilian data was in the common SIRGAS 2000 projection, so we try a few likely UTM zones and check whether the hillshaded DSM produced by las2dem falls onto the right spot in Google Earth.

las2dem -i tiles_raw\CODL0003-C0006.laz ^
        -keep_first -thin_with_grid 1 ^
        -hillshade -epsg 31983 ^
        -o epsg_check.png

Hillshaded DSM and Google Earth imagery align for EPSG code 31983

The lasinfo report also tells us that the xyz coordinates are stored with millimeter resolution, which is a bit of overkill. For higher and faster LASzip compression we will later lower this to a more appropriate centimeter resolution. It further tells us that the returns are stored using point type 0, which is a bit unfortunate. This (older) point type does not have a GPS time stamp, so some quality checks (e.g. "completeness of returns" with lasreturn) and operations (e.g. "resorting of returns into acquisition order" with lassort) will not be possible. Fortunately the min-max range of the 'point source ID' suggests that this point attribute is correctly populated with flightline numbers, so we can check the overlap and alignment of the different flightlines that contribute to the LiDAR in each tile.

lasoverlap -i tiles_raw\*.laz ^
           -min_diff 0.2 -max_diff 0.4 ^
           -epsg 31983 ^
           -odir quality -opng ^
           -cores 3

We run lasoverlap to visualize the amount of overlap between flightlines and the vertical differences between them. The produced images (see below) color code the number of flightlines and the maximum vertical difference between any two flightlines. Most of the area is cyan (2 flightlines) except in the bottom left where the pilot was sloppy and left some gaps in the yellow seams (3 flightlines) so that some spots are only blue (1 flightline). We also see that two tiles in the upper left are partly covered by a diagonal flightline. We will drop that flightline later to create a more uniform density across the tiles. The mostly blue areas in the difference image align more with features in the landscape than with the flightline pattern. Closer inspection shows that these vertical differences occur mainly in the dense old-growth forests with species of different heights that are much harder to penetrate by the laser than the uniform and short-lived Eucalyptus plantation that is more of a "dead forest" with little undergrowth or animal habitat.

Interesting observation: The vertical difference of the lowest return from different flightlines computed per 2 meter by 2 meter grid cell could maybe be used as a new forestry metric to help distinguish monocultures from natural forests.

lasgrid -i tiles_raw\*.laz ^
        -keep_last ^
        -step 2 -point_density ^
        -false -set_min_max 10 20 ^
        -epsg 31983 ^
        -odir quality -odix _d_2m_10_20 -opng ^
        -cores 3

lasgrid -i tiles_raw\*.laz ^
        -keep_last ^
        -step 5 -point_density ^
        -false -set_min_max 10 20 ^
        -epsg 31983 ^
        -odir quality -odix _d_5m_10_20 -opng ^
        -cores 3

We run lasgrid to visualize the pulse density per 2 by 2 meter cell and per 5 by 5 meter cell. The produced images (see below) color code the number of last returns per square meter. The impact of the tall Eucalyptus trees on the density-per-cell computation is evident for the smaller 2 meter cell size in the form of a noisy blue/red diagonal in the top right as well as a noisy blue/red area in the bottom left. Both of those turn to a more consistent yellow for the density-per-cell computation with the larger 5 meter cells. Immediately evident is the higher density (red) for those parts of the two tiles in the upper left that are covered by the additional diagonal flightline. The blueish area to the left of the center of the image suggests a consistently lower pulse density whose cause remains to be investigated: Lower reflectivity? Missing last returns? Cloud cover?

The lasinfo report suggests that the tiles are already classified. We could either use the ground classification provided by the vendor or re-classify the data ourselves (using lastile, lasnoise, and lasground). We check the quality of the ground classification by visually inspecting a hillshaded DTM created with las2dem from the ground returns. We buffer the tiles on-the-fly for a seamless hillshade without artifacts along tile boundaries by adding '-buffered 25' and '-use_orig_bb' to the command line. To speed up reading the 25 meter buffers from neighboring tiles we first create a spatial index with lasindex.

lasindex -i tiles_raw\*.laz ^
         -cores 3

las2dem -i tiles_raw\*.laz ^
        -buffered 25 ^
        -keep_class 2 -thin_with_grid 0.5 ^
        -use_orig_bb ^
        -hillshade -epsg 31983 ^
        -odir quality -odix _dtm -opng ^
        -cores 3

hillshaded DTM tiles generated with las2dem and on-the-fly buffering

The resulting hillshaded DTM shows a few minor issues in the ground classification but also a big bump (above the mouse cursor). Closer inspection of this area (you can cut it from the larger tile using the command line below) shows that there is a group of misclassified points about 1200 meters below the terrain. Hence, we will start from scratch to prepare the data for the extraction of forestry metrics.

las2las -i tiles_raw\CODL0004-C0006.laz ^
        -inside_tile 207900 7358350 100 ^
        -o bump.laz

lasview -i bump.laz

bump in hillshaded DTM caused by misclassified low noise

Data Preparation

Using lastile we first tile the data into smaller 500 meter by 500 meter tiles with a 25 meter buffer while flagging all points in the buffer as 'withheld'. In the same step we lower the resolution to centimeters and put a nicer coordinate offset into the LAS header. We also remove the existing classification and classify all points that are much lower than the target terrain as class 7 (aka noise). We also add CRS information and give each tile the base name 'suzana.laz'.

lastile -i tiles_raw\*.laz ^
        -rescale 0.01 0.01 0.01 ^
        -auto_reoffset ^
        -set_classification 0 ^
        -classify_z_below_as 500.0 7 ^
        -tile_size 500 ^
        -buffer 25 -flag_as_withheld ^
        -epsg 31983 ^
        -odir tiles_buffered -o suzana.laz

With lasnoise we mark the many isolated points that are high above or below the terrain as class 7 (aka noise). Using two tiles we played around with the 'step' parameters until we found good settings. See the README of lasnoise for the exact meaning and the choice of parameters for noise classification. We then look at one of the resulting tiles with lasview.

lasnoise -i tiles_buffered\*.laz ^
         -step_xy 4 -step_z 2 ^
         -odir tiles_denoised -olaz ^
         -cores 3

lasview -i tiles_denoised\suzana_206000_7357000.laz ^
        -color_by_classification ^
        -win 1024 192

noise points shown in pink: all points (top), only noise points (bottom)

Next we use lasground to classify the last returns into ground (2) and non-ground (1). It is important to ignore the noise points with classification 7 to avoid the kind of bump we saw in the vendor-delivered classification. We again check the quality of the computed ground classification by producing a hillshaded DTM with las2dem. Here the las2dem command line is slightly different: instead of buffering on-the-fly we use the buffers stored with each tile.

lasground -i tiles_denoised\*.laz ^
          -ignore_class 7 ^
          -nature -extra_fine ^
          -odir tiles_ground -olaz ^
          -cores 3

las2dem -i tiles_ground\*.laz ^
        -keep_class 2 -thin_with_grid 0.5 ^
        -hillshade ^
        -use_tile_bb ^
        -odir quality -odix _dtm_new -opng ^
        -cores 3

Finally, with lasheight we compute how high each return is above the triangulated surface of all ground returns and store this height value in place of the elevation value in the z coordinate using the '-replace_z' switch. This height-normalizes the LiDAR in the sense that all ground returns are set to an elevation of 0 while all other returns get an elevation relative to the ground. The result is a set of height-normalized LiDAR tiles that are ready for producing the desired forestry metrics.

lasheight -i tiles_ground\*.laz ^
          -replace_z ^
          -odir tiles_normalized -olaz ^
          -cores 3

Metric Production

The tool for computing the metrics for the entire area as well as for the individual field plots is lascanopy. Which metrics are well suited for your particular imputation calculation is your job to determine. Maybe first compute a large number of them and then eliminate the redundant ones. Do not use any point from the tile buffers for these calculations. We had flagged them as ‘withheld’ during the lastile operation, so they are easy to drop. We also want to drop the noise points that were classified as 7. And we were planning to drop that additional diagonal flightline we noticed during quality checking. We generated two lasinfo reports with the ‘-histo point_source 1’ option for the two tiles it was covering. From the two histograms it was easy to see that the diagonal flightline has the number 31.
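
To reproduce this check, a lasinfo call along the following lines (here simply run on all tiles; the two affected tiles are obvious from the overlap images, and the '_ps_histo' suffix is just a suggestion to keep these reports apart from the earlier ones) produces the per-tile histograms of the 'point source ID' field:

lasinfo -i tiles_raw\*.laz ^
        -histo point_source 1 ^
        -odir quality -odix _ps_histo -otxt ^
        -cores 3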

First we run lascanopy on the 11 plots that you can download here. When running on plots it makes sense to first create a spatial index with lasindex for faster querying so that only those tiny parts of the LAZ files that actually cover the plots need to be loaded.

lasindex -i tiles_normalized\*.laz ^
         -cores 3

lascanopy -i tiles_normalized\*.laz -merged ^
          -drop_withheld ^
          -drop_class 7 ^
          -drop_point_source 31 ^
          -lop WKS_PLOTS.shp ^
          -cover_cutoff 3.0 ^
          -cov -dns ^
          -height_cutoff 2.0 ^
          -c 2.0 999.0 ^
          -max -avg -std -kur ^
          -p 25 50 75 95 ^
          -b 30 50 80 ^
          -d 2.0 5.0 10.0 50.0 ^
          -o plots.csv

You can easily process the resulting 'plots.csv' file in other software packages. It contains one line for each polygonal plot listed in the shapefile, giving its bounding box followed by all the requested metrics. But why is there a zero maximum height (marked in orange) for plots 6 through 10? All height metrics are computed solely from returns that are higher than the 'height_cutoff' that was set to 2 meters. We added the '-c 2.0 999.0' absolute count metric to keep track of the number of returns used in these calculations. Apparently in plots 6 through 10 there was not a single return above 2 meters as the count (also marked in orange) is zero for all these plots. Turns out this Eucalyptus stand had recently been harvested and the new seedlings are still shorter than 2 meters.

more plots.csv
index,min_x,min_y,max_x,max_y,max,avg,std,kur,p25,p50,p75,p95,b30,b50,b80,c00,d00,d01,d02,cov,dns
0,206260.500,7358289.909,206283.068,7358312.477,11.23,6.22,1.91,2.26,4.71,6.01,7.67,9.5,26.3,59.7,94.2,5359,18.9,41.3,1.5,76.3,60.0
1,206422.500,7357972.909,206445.068,7357995.477,13.54,7.5,2.54,1.97,5.32,7.34,9.65,11.62,26.9,54.6,92.2,7030,12.3,36.6,13.3,77.0,61.0
2,206579.501,7358125.909,206602.068,7358148.477,12.22,5.72,2.15,2.5,4.11,5.32,7.26,9.76,46.0,73.7,97.4,4901,24.8,28.7,2.0,66.8,51.2
3,206578.500,7358452.910,206601.068,7358475.477,12.21,5.68,2.23,2.64,4.01,5.14,7.18,10.04,48.3,74.1,95.5,4861,25.7,26.2,2.9,68.0,50.2
4,206734.501,7358604.910,206757.068,7358627.478,15.98,10.3,2.18,2.64,8.85,10.46,11.9,13.65,3.3,27.0,91.0,4946,0.6,32.5,44.5,91.0,77.5
5,207043.501,7358761.910,207066.068,7358784.478,15.76,10.78,2.32,3.43,9.27,11.03,12.49,14.11,3.2,20.7,83.3,4819,1.5,24.7,51.0,91.1,76.8
6,207677.192,7359630.526,207699.760,7359653.094,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.0,0.0,0.0,0,0.0,0.0,0.0,0.0,0.0
7,207519.291,7359372.366,207541.859,7359394.934,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.0,0.0,0.0,0,0.0,0.0,0.0,0.0,0.0
8,207786.742,7359255.850,207809.309,7359278.417,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.0,0.0,0.0,0,0.0,0.0,0.0,0.0,0.0
9,208159.017,7358997.344,208181.584,7359019.911,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.0,0.0,0.0,0,0.0,0.0,0.0,0.0,0.0
10,208370.909,7358602.565,208393.477,7358625.133,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.0,0.0,0.0,0,0.0,0.0,0.0,0.0,0.0

Then we run lascanopy on the entire area and produce one raster per tile for each metric. Here we remove the buffered points with the '-use_tile_bb' switch, which also ensures that the produced rasters have exactly the extent of the tiles without buffers. Again, it is imperative that you use version 170327 (or higher) of LAStools for this to work correctly.

lascanopy -version
LAStools (by martin@rapidlasso.com) version 170327 (academic)

lascanopy -i tiles_normalized\*.laz ^
          -use_tile_bb ^
          -drop_class 7 ^
          -drop_point_source 31 ^
          -step 10 ^
          -cover_cutoff 3.0 ^
          -cov -dns ^
          -height_cutoff 2.0 ^
          -c 2.0 999.0 ^
          -max -avg -std -kur ^
          -p 25 50 75 95 ^
          -b 30 50 80 ^
          -d 2.0 5.0 10.0 50.0 ^
          -odir tile_metrics -oasc ^
          -cores 3

The resulting rasters in ASC format can easily be previewed using lasview for some “sanity checking” that our metrics make sense and to get a quick overview about what these metrics look like.

lasview -i tile_metrics\suzana_*max.asc
lasview -i tile_metrics\suzana_*p95.asc
lasview -i tile_metrics\suzana_*p50.asc
lasview -i tile_metrics\suzana_*p25.asc
lasview -i tile_metrics\suzana_*cov.asc
lasview -i tile_metrics\suzana_*d00.asc
lasview -i tile_metrics\suzana_*d01.asc
lasview -i tile_metrics\suzana_*b30.asc
lasview -i tile_metrics\suzana_*b80.asc

The maximum height rasters are useful to inspect more closely as they will immediately tell us if there was any high noise point that slipped through the cracks. And indeed it happened, as we see a maximum of 388.55 meters for one of the 10 by 10 meter cells. As we know the expected height of the trees we could have added a '-drop_z_above 70' to the lascanopy command line. Be careful, however, when computing forestry metrics in strongly sloped terrain as the terrain slope can significantly lift up returns to heights much higher than that of the tree. This is guaranteed to happen for LiDAR returns from branches that extend horizontally far over the down-sloped part of the terrain as shown in this paper here.
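
For illustration, this is how the lascanopy call from above could be re-run with such a guard against high noise; the 70 meter threshold is just the value suggested here and should be adapted to the expected maximum tree height:

lascanopy -i tiles_normalized\*.laz ^
          -use_tile_bb ^
          -drop_class 7 ^
          -drop_point_source 31 ^
          -drop_z_above 70 ^
          -step 10 ^
          -cover_cutoff 3.0 ^
          -cov -dns ^
          -height_cutoff 2.0 ^
          -c 2.0 999.0 ^
          -max -avg -std -kur ^
          -p 25 50 75 95 ^
          -b 30 50 80 ^
          -d 2.0 5.0 10.0 50.0 ^
          -odir tile_metrics -oasc ^
          -cores 3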

We did not use the shapefile for the stands in this exercise. We could have clipped the normalized LiDAR points to these stands using lasclip as shown in the GUI below before generating the raster metrics. This would have saved space and computation time as many of the LiDAR points lie outside of the stands. However, it might be better to do that clipping step on the rasters in whichever GIS software or statistics package you are using for the imputation computation to properly account for partly covered raster cells along the stand boundary. This could be the subject of another blog article … (-:
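
For completeness, a lasclip call along these lines could have done that clipping; the name 'STANDS.shp' is a placeholder for whatever the downloaded stands shapefile is called:

lasclip -i tiles_normalized\*.laz -merged ^
        -poly STANDS.shp ^
        -o stands_normalized.laz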

Not all of the LiDAR was needed to compute the metrics for the stands.

NRW Open LiDAR: Merging Points into Proper LAS Files

In the first part of this series we downloaded, compressed, and viewed some of the newly released open LiDAR data for the state of North Rhine-Westphalia. In the second part we look at how to merge the multiple point clouds provided back into single LAS or LAZ files that are as proper as possible. Follow along with a recent version of LAStools and a pair of DGM and DOM files for your area of interest. For downloading the LiDAR we suggest using the wget command line tool with the option '-c', which after an interruption in transmission will restart where it left off.
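
A typical invocation looks like the following; the URLs are placeholders that need to be replaced with the actual download links from the OpenNRW portal:

wget -c https://example.org/opennrw/dgm1l_bonn_xyz.zip
wget -c https://example.org/opennrw/dom1l_bonn_xyz.zip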

In the first part of this series we downloaded the pair of DGM and DOM files for the City of Bonn. The DGM file and the DOM file are zipped archives that contain the points in 1 km by 1 km tiles stored as x, y, z coordinates with centimeter resolution. We had already converted these textual *.xyz files into binary *.laz files with the open source LASzip compressor that is distributed with LAStools, as described in that blog post. We continue now with the assumption that you have converted all of the *.xyz files to *.laz files as described there.
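
If you still need to do that conversion, a minimal txt2las call along these lines (assuming the *.xyz files follow the same naming scheme as the *.laz files listed below; add the appropriate '-epsg' code for your tiles) turns the ASCII files into compressed LAZ files with centimeter resolution:

txt2las -i dgm1l*.xyz ^
        -parse xyz ^
        -set_scale 0.01 0.01 0.01 ^
        -olaz

The same call repeated for the dom1l*.xyz files completes the conversion.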

The mapping from tile names in DGM and DOM archives to the classification and return type of points: lp = last return, fp = first return, ab/aw/ag = synthetic points

There are multiple tiles for each square kilometer as the LiDAR has been split into different files based on classification and return type. Furthermore there are also synthetic points that were created by the land survey department to replace LiDAR under bridges and along buildings for generating higher quality rasters. We want to combine all points of a square kilometer into a single LAZ tile as is usually expected. Simply merging the multiple files per tile is not an option as this would result in losing point classification and return type information as well as in duplicating all single returns that are stored in more than one file. The folks at OpenNRW offer this helpful graphic to explain what the acronyms above mean:

Illustration of how acronyms used in tile names correspond to point classification and type.

In the following we'll be looking at the set of files corresponding to the UTM tile 32366 / 5622. We wanted an interesting area with large buildings, a bridge, and water, and looked for a suitable tile using the KML overlays generated in part one. The tile along the Rhine river selected in the picture below covers the old city hall, the opera house, and the "Kennedy Bridge" and has a complete set of DGM and DOM files:

      3,501 dgm1l-ab_32366_5622_1_nw.laz
     16,061 dgm1l-ag_32366_5622_1_nw.laz
      3,269 dgm1l-aw_32366_5622_1_nw.laz
    497,008 dgm1l-brk_32366_5622_1_nw.laz
  7,667,715 dgm1l-lpb_32366_5622_1_nw.laz
 12,096,856 dgm1l-lpnb_32366_5622_1_nw.laz
     15,856 dgm1l-lpub_32366_5622_1_nw.laz

      3,269 dom1l-aw_32366_5622_1_nw.laz
 21,381,106 dom1l-fp_32366_5622_1_nw.laz

We find the name of the tile that covers the “Kennedy Bridge” using the KML overlays generated in part one.

We now assign classification codes and flags to the returns from the different files using las2las, merge them together with lasmerge, and recover single, first, and last return information with lasduplicate. With las2las we set the classification to bridge deck (17) for all returns in the files with the acronym 'brk' (= bridge points), to ground (2) for the acronym 'lpb' (= last return ground), to unclassified (1) for the acronym 'lpnb' (= last return non-ground), and to noise (7) for the acronym 'lpub' (= last return under ground), and store the resulting files in a temporary folder.

las2las -i dgm1l-brk_32366_5622_1_nw.laz ^
        -set_classification 17 ^
        -odir temp -olaz

las2las -i dgm1l-lpb_32366_5622_1_nw.laz ^
        -set_classification 2 ^
        -odir temp -olaz

las2las -i dgm1l-lpnb_32366_5622_1_nw.laz ^
        -set_classification 1 ^
        -odir temp -olaz

las2las -i dgm1l-lpub_32366_5622_1_nw.laz ^
        -set_classification 7 ^
        -odir temp -olaz

Next we use the synthetic flag of the LAS format specification to flag any additional points that were added (not measured) by the survey department to generate better raster products. We set the classification to ground (2) and the synthetic flag for all points of the files with the acronym 'ab' (= additional ground) and the acronym 'ag' (= additional building footprint). We set the classification to water (9) and the synthetic flag for all points of the files with the acronym 'aw' (= additional water bodies). Files with acronym 'aw' appear both in the DGM and the DOM archive. Obviously we need to keep only one copy.

las2las -i dgm1l-ab_32366_5622_1_nw.laz ^
        -set_classification 2 ^
        -set_synthetic_flag 1 ^
        -odir temp -olaz

las2las -i dgm1l-ag_32366_5622_1_nw.laz ^
        -set_classification 2 ^
        -set_synthetic_flag 1 ^
        -odir temp -olaz

las2las -i dgm1l-aw_32366_5622_1_nw.laz ^
        -set_classification 9 ^
        -set_synthetic_flag 1 ^
        -odir temp -olaz

Using lasmerge we merge all returns from files with acronyms ‘brk’ (= bridge points), ‘lpb’ (= last return ground),  ‘lpnb’ (= last return non-ground), and ‘lpub’ (= last return under ground) into a single file that will then contain all of the (classified) last returns for this tile.

lasmerge -i temp\dgm1l-brk_32366_5622_1_nw.laz ^
         -i temp\dgm1l-lpb_32366_5622_1_nw.laz ^
         -i temp\dgm1l-lpnb_32366_5622_1_nw.laz ^
         -i temp\dgm1l-lpub_32366_5622_1_nw.laz ^
         -o temp\dgm1l-lp_32366_5622_1_nw.laz

Next we run lasduplicate three times to recover which points are single returns and which points are the first and the last return of a pair of points generated by the same laser shot. First we run lasduplicate with option ‘-unique_xyz’ to remove any xyz duplicates from the last return file. We also mark all surviving returns as the second of two returns. Similarly, we remove any xyz duplicates from the first return file and mark all survivors as the first of two returns. Finally we run lasduplicate with option ‘-single_returns’ with the unique last and the unique first return files as ‘-merged’ input. If a return with the exact same xyz coordinates appears in both files only the first copy is kept and marked as a single return. In order to keep the flags and classifications from the last return file, the order in which the last and first return files are listed in the command line is important.

lasduplicate -i temp\dgm1l-lp_32366_5622_1_nw.laz ^
             -set_return_number 2 -set_number_of_returns 2 ^
             -unique_xyz ^
             -o temp\last_32366_5622_1_nw.laz

lasduplicate -i dom1l-fp_32366_5622_1_nw.laz ^
             -set_return_number 1 -set_number_of_returns 2 ^
             -unique_xyz ^
             -o temp\first_32366_5622_1_nw.laz

lasduplicate -i temp\last_32366_5622_1_nw.laz ^
             -i temp\first_32366_5622_1_nw.laz ^
             -merged ^
             -single_returns ^
             -o temp\all_32366_5622_1_nw.laz

We then add the synthetic points with another call to lasmerge to obtain a LAZ file containing all points of the tile correctly classified, flagged, and return-numbered.

lasmerge -i temp\dgm1l-ab_32366_5622_1_nw.laz ^
         -i temp\dgm1l-ag_32366_5622_1_nw.laz ^
         -i temp\dgm1l-aw_32366_5622_1_nw.laz ^
         -i temp\all_32366_5622_1_nw.laz ^
         -o temp\merged_32366_5622_1_nw.laz

Optional: For more efficient use of this file in subsequent processing – and especially to accelerate area-of-interest queries with lasindex – it is often of great advantage to reorder the points in a spatially coherent manner. A simple call to lassort will rearrange the points along a space-filling curve such as a Hilbert curve or a Z-order curve.

lassort -i temp\merged_32366_5622_1_nw.laz ^
        -o bonn_32366_5622_1_nw.laz

Note that we also renamed the file because a good name can be useful if you find that file again two years from now. Let's have a look at the result with lasview.

lasview -i bonn_32366_5622_1_nw.laz

In lasview you can press <c> repeatedly to switch through all available coloring modes until you see the yellow (single) / red (first) / blue (last) coloring of the returns. The recovered return types are especially evident under vegetation, in the presence of wires, and along building edges. Press <x> to select an area of interest and press <x> again to inspect it more closely. Press <i> while hovering above a point to show its coordinates, classification, and return type.

We did each processing in separate steps to illustrate the overall workflow. The above sequence of LAStools command line calls can be shortened by combining multiple processing steps into one operation. This is left as an exercise for the advanced LAStools user … (-;

Acknowledgement: The LiDAR data of OpenNRW comes with a very permissive license. It is called “Datenlizenz Deutschland – Namensnennung – Version 2.0” or “dl-de/by-2-0” and allows data and derivative sharing as well as commercial use. It only requires us to name the source. We need to cite the “Land NRW (2017)” with the year of the download in brackets and specify the Uniform Resource Identifier (URI) for both the DOM and the DGM. Done. So easy. Thank you, OpenNRW … (-:

RIEGL Becomes LASzip Sponsor for LAS 1.4 Extension

PRESS RELEASE (for immediate release)
August 31, 2015
rapidlasso GmbH, Gilching, Germany

We are happy to announce that RIEGL Laser Measurement Systems, Austria, has become a sponsor of the award-winning LASzip compressor. Their contribution at the Silver level will kick off the actual development phase of the “native LAS 1.4 extension” that had been discussed with the LiDAR community over the past two years. This “native extension” for LAS 1.4 complements the existing “compatibility mode” for LAS 1.4 that was supported by Gold sponsor NOAA and Bronze sponsors Quantum Spatial and Trimble Geospatial. The original sponsor who initiated and financed the open sourcing of the LASzip compressor was USACE – the US Army Corps of Engineers (see http://laszip.org).

The existing “LAS 1.4 compatibility mode” in LASzip was created to provide immediate support for compressing the new LAS 1.4 point types by rewriting them as old point types and storing their new information as “Extra Bytes”. As an added side-benefit this has allowed legacy software without LAS 1.4 support to readily read these newer LAS files as most of the important fields of the new point types 6 to 10 can be mapped to fields of the older point types 1, 3, or 5.

In contrast, the new “native LAS 1.4 extension” of LASzip that is now sponsored in part by RIEGL will utilize the “natural break” in the format due to the new point types of LAS 1.4 to introduce entirely new features such as “selective decompression”, “rewritable classifications and flags”, “integrated spatial indexing”, … and other functionality that has been brainstormed with the community since rapidlasso GmbH issued the open “call for input” on native LASzip compression for LAS 1.4 in January 2014. We invite you to follow the progress or contribute to the development via the discussions in the “LAS room“.

About rapidlasso GmbH:
Technology powerhouse rapidlasso GmbH specializes in efficient LiDAR processing tools that are widely known for their high productivity. They combine robust algorithms with efficient I/O and clever memory management to achieve high throughput for data sets containing billions of points. The company’s flagship product – the LAStools software suite – has deep market penetration and is heavily used in industry, government agencies, research labs, and educational institutions. Visit http://rapidlasso.com for more information.

About RIEGL:
Austrian based RIEGL Laser Measurement Systems is a performance leader in research, development and production of terrestrial, industrial, mobile, bathymetric, airborne and UAS-based laser scanning systems. RIEGL’s innovative hard- and software provides powerful solutions for nearly all imaginable fields of application. Worldwide sales, training, support and services are delivered from RIEGL‘s Austrian headquarters and its offices in Vienna, Salzburg, and Styria, main offices in the USA, Japan, and in China, and by a worldwide network of representatives covering Europe, North and South America, Asia, Australia and Africa. Visit http://riegl.com for more information.

Use Buffers when Processing LiDAR in Tiles !!!

We often process LiDAR in tiles for two reasons: first, to keep the number of points per file low and use main memory efficiently, and second, to speed up the computation with parallel tile processing and keep all cores of a modern CPU busy. However, it is very (!!!) important to take the necessary precautions to avoid “edge artifacts” when processing LiDAR in tiles. We have to include points from neighboring tiles during certain LAStools processing steps to avoid edge artifacts. Why? Here is an illustration from our PHIL LiDAR tour earlier this year:

Buffers are important to avoid edge artifacts along tile boundaries during DTM creation.

What you see is the temporary TIN of ground points created (internally) by las2dem or blast2dem that is then rastered at the user-specified step size onto a grid. Without a buffer (right side) there will not always be a triangle to cover every pixel. Especially in the corners of the tile you will often find empty pixels. Furthermore the poorly shaped “sliver triangles” along the boundary of the TIN do not interpolate the ground elevations properly. In contrast, with a buffer (left side) the TIN generously covers the entire area that is to be rastered with nicely shaped triangles.

The Christmas cookie analogy: buffers are like generously rolling out the dough

Here is the Christmas cookie analogy: You need to roll out the dough larger than the cookies you will cut to make sure your cookies have nice edges. Think of the TIN as the dough and the square tile as your cookie cutter. You need to use a sufficiently large buffer when you roll out your TIN to ensure an edge without crumbles when you cut out the tile … (-: … otherwise you are pretty much guaranteed to get results that – upon closer inspection – have these kinds of artifacts:

Without buffers, processing artifacts also happen when classifying points with lasground or lasclassify, when calculating height above ground or height-normalizing LiDAR tiles with lasheight, when removing noise with lasnoise, when creating contours with las2iso or blast2iso, or during any other operation where an incomplete neighborhood of points can affect the results. Hence, we need to surround each tile with a temporary buffer of points. Currently there are two ways of working with buffers in LAStools:

  1. creating buffered tiles during the initial tiling step with the ‘-buffer 25’ option of lastile, maintaining buffered tiles throughout processing and finally using the ‘-use_tile_bb’ option of lasgrid, las2dem, blast2dem, or lascanopy to raster the tiles without the temporary buffer.
  2. creating buffered tiles from non-overlapping (= unbuffered) tiles with “on-the-fly” buffering using the ‘-buffered 25’ option of most LAStools such as lasground, lasheight, or las2dem. For some workflows it is useful to also add ‘-remain_buffered’ if buffers are needed again in the next step. Finally, we use the ‘-use_orig_bb’ option of lasgrid, las2dem, blast2dem, or lascanopy to raster the tiles without the temporary buffer.

In the following we give three (tiny) examples using the venerable ‘fusa.laz’ sample that is distributed with LAStools to illustrate the two types of buffering as well as to show what happens when no buffers are used. In each example we first cut the small ‘fusa.laz’ sample into nine smaller tiles and then process these separately on 4 cores in parallel.

1. Initial buffer creation with lastile

This is what most of my tutorials teach. It assumes you are the one creating the tiling in the first place. If you do it with lastile and add a buffer right from the start things are pretty easy.

lastile -i ..\data\fusa.laz ^
        -set_classification 0 -set_user_data 0 ^
        -tile_size 100 -buffer 20 ^
        -odir 1_raw -o futi.laz

We cut the input into 100 meter by 100 meter tiles but add a 20 meter buffer around each tile. That means that each tile on disk will contain the points for an area of up to 140 meter by 140 meter. The GUI for LAStools shows the overlap, and if you scrutinize the bounding box values that the cursor points to, you notice the extra 20 meters in each direction.

Tiles buffered with lastile

Now we can forget about the buffers and run the standard workflow consisting of lasground, lasheight, and lasclassify to distinguish ground, vegetation, and building points in the LiDAR tiles.

lasground -i 1_raw\futi*.laz ^
          -city ^
          -odir 1_ground -olaz ^
          -cores 4
lasheight -i 1_ground\futi*.laz ^
          -drop_above 50 ^
          -odir 1_height -olaz ^
          -cores 4
lasclassify -i 1_height\futi*.laz ^
            -odir 1_classify -olaz ^
            -cores 4

At the end – when we generate raster products – we have to remember that the tiles were buffered by lastile and cut off the buffers when we raster the TIN with option ‘-use_tile_bb’ of las2dem.

las2dem -i 1_classify\futi*.laz ^
        -keep_class 2 6 ^
        -step 0.25 -use_tile_bb ^
        -odir 1_dbm -obil ^
        -cores 4

We created a digital terrain model with buildings (DBM) by keeping the points with classification 2 (ground) and 6 (building). After loading the resulting 9 tiles into QGIS and generating a virtual raster we see a nice seamless DBM without any edge artifacts.

The DBM of the 9 tiles computed with buffers created by lastile has no edge artifacts across tile boundaries.

If you need to deliver the LiDAR files you should remove the buffers with lastile and option ‘-remove_buffer’.

lastile -i 1_classify\futi*.laz ^
        -remove_buffer ^
        -odir 1_final -olaz ^
        -cores 4

2. On-the-fly buffering

Now assume you are given LiDAR tiles without buffers. We generate them here with lastile.

lastile -i ..\data\fusa.laz ^
        -set_classification 0 -set_user_data 0 ^
        -tile_size 100 ^
        -odir 2_raw -o futi.laz

The only difference is that we do not request the 20 meter buffer and the result is a typical tiling as you may receive it from a vendor or download it from a LiDAR portal. The GUI for LAStools shows that there is no overlap, and if you scrutinize the bounding box values that the cursor points to, you see that each tile is exactly 100 meters by 100 meters.

Tiles without buffer

Now we have to think about buffers a lot. When using on-the-fly buffering we should first spatially index the tiles with lasindex for faster access to the points from neighbouring tiles.

lasindex -i 2_raw\futi*.laz -cores 4

Below are the modifications to the standard workflow of lasground, lasheight, and lasclassify that are needed for on-the-fly buffering. The first lasground run uses ‘-buffered 20’ to add buffers to each tile and ‘-remain_buffered’ to write those buffers to disk. This way they do not have to be created again by lasheight and lasclassify.

lasground -i 2_raw\futi*.laz ^
          -buffered 20 -remain_buffered ^
          -city ^
          -odir 2_ground -olaz ^
          -cores 4
lasheight -i 2_ground\futi*.laz ^
          -remain_buffered ^
          -drop_above 50 ^
          -odir 2_height -olaz ^
          -cores 4
lasclassify -i 2_height\futi*.laz ^
            -remain_buffered ^
            -odir 2_classify -olaz ^
            -cores 4

At the end we have to remember that the tiles still have on-the-fly buffers and cut them off with the option ‘-use_orig_bb’ of las2dem.

las2dem -i 2_classify\futi*.laz ^
        -keep_class 2 6 ^
        -step 0.25 -use_orig_bb ^
        -odir 2_dbm -obil ^
        -cores 4

Again, we created a digital terrain model with buildings (DBM) by keeping the points with classification 2 (ground) and 6 (building). The resulting hillshade computed from a virtual raster that combines the 9 BIL rasters into one looks perfectly smooth in QGIS.

The hillshaded DBM of the 9 tiles computed with on-the-fly buffering has no edge artifacts across tile boundaries.

If you need to deliver the LiDAR files you should probably remove the buffers first … but that is not yet implemented. (-:

lastile -i 2_classify\futi*.laz ^
        -remove_buffer ^
        -odir 2_final -olaz ^
        -cores 4

3. Bad: No buffering

Here is what you are *not* supposed to do. Assume you get unbuffered tiles.

lastile -i ..\data\fusa.laz ^
        -set_classification 0 -set_user_data 0 ^
        -tile_size 100 ^
        -odir 3_raw -o futi.laz

Bad. You do not take care of buffering when processing the tiles.

lasground -i 3_raw\futi*.laz ^
          -city ^
          -odir 3_ground -olaz ^
          -cores 4
lasheight -i 3_ground\futi*.laz ^
          -drop_above 50 ^
          -odir 3_height -olaz ^
          -cores 4
lasclassify -i 3_height\futi*.laz ^
            -odir 3_classify -olaz ^
            -cores 4

Bad. You do not take care of buffering when generating the DBM.

las2dem -i 3_classify\futi*.laz ^
        -keep_class 2 6 ^
        -step 0.25 ^
        -odir 3_dbm -obil ^
        -cores 4

Bad. You get crappy results with edge artifacts clearly visible in the hillshade.

The hillshaded DBM of the 9 tiles computed WITHOUT using buffers has severe edge artifacts across tile boundaries.

Bad. If you zoom in on a corner where 4 tiles meet you find missing pixels and incorrect elevation values. Bad. Bad. Bad. So please folks. Try this on your own data. Notice the horrible edge artifacts. Then always use buffers … (-:

PS: Usually no buffers are needed for running lasgrid, lasoverlap, or lascanopy as they perform simple binning operations that do not make use of neighbour information.

Preparing raw LiDAR for efficient (online) distribution

On August 14th, Prof. David Pyle tweeted about raw LiDAR being publicly available for the volcanic island Nea Kameni and its little sibling Palea Kameni that are part of the Santorini caldera in Greece. A big “kudos” to all those like Prof. Pyle who share raw LiDAR data online for all to download – be it for transparent research, as an open data policy, or to enable innovation. Although fancy download portals like OpenTopography are great, already a folder full of files accessible via simple FTP or HTTP is an incredible resource. For the latter, here are some tips from rapidlasso on how to prepare your raw LiDAR flightlines with LAStools so that they are as good as they can be for those that download them … (-:

The twelve LiDAR flight lines for the Kameni islands are a perfectly sized example of how to prepare raw LiDAR for efficient (online) distribution. They are provided both in the LAS format as well as simple ASCII files. Stored in the LAS format, the twelve strips (1,2,3,4,5,6,7,8,9,10,11,12) are about 1 Gigabyte:

File sizes of the raw Santorini LiDAR strips

A lasinfo report tells us the following:
(1) The exporting software used the wrong GeoTIFF tag to specify the UTM 35 (north) projection via EPSG code 32635.
(2) There are four proprietary “LeicaGeo” VLRs totalling 22000 bytes stored in each header. Does someone know what they contain and how to read them?
(3) There are two legacy bytes following the header that are part of the (now somewhat dated) LAS 1.0 specification.
(4) The coordinates are stored with millimeter resolution (i.e. the scale factors are 0.001). This is overkill for airborne LiDAR. Those millimeters are just scanning noise, are misleading, and negatively affect compression.
(5) The file does not store flight line numbers in the “file source ID” field.
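
Such a report can be produced, for example, with a call like the following (the strip file names depend on the download):

lasinfo -i *.LAS ^
        -odix _info -otxt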

lasinfo report for one of the raw files

Visualizing the twelve flight lines with lasview further illustrates:
(1) A cruise ship and a cloud were captured. The latter is classified as noise (7).
(2) An intensity value is stored for each return.
(3) Almost all laser shots resulted in a single return, which is not surprising for a volcanic island without vegetation. The multi-returns from the cloud are a colorful exception.
(4) There are some strange (=> useless) numbers stored in the “point source ID” field of each point that should really store the flight line number of each point.
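
A simple call such as the one below (run in the folder with the downloaded strips) is enough to reproduce this inspection; pressing <c> in the viewer switches through the available coloring modes:

lasview -i *.LAS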

We suggest fixing up this raw LiDAR to make it more efficient and useful for those receiving it. We run las2las to do the following:
(1) Remove the legacy two bytes following the header.
(2) Switch to the 1.2 version of the LAS format.
(3) Remove all existing VLRs.
(4) Set the horizontal projection to EPSG code 32635 and the vertical datum to WGS84.
(5) Set the file source ID of each file and the point source ID of each point to the same value. This value starts at one and is incremented with each file.
(6) Rescale coordinate resolution to centimeters.
(7) Append the string ‘_cm’ to the original file name to output LASzip-compressed files that end in ‘*.laz’.

las2las -i *.LAS ^
        -remove_extra ^
        -set_version 1.2 ^
        -remove_all_vlrs ^
        -epsg 32635 -vertical_wgs84 ^
        -files_are_flightlines ^
        -rescale 0.01 0.01 0.01 ^
        -odix _cm -olaz

This results in much smaller files that are easier to host and faster to download. We achieve almost a factor of 10 in compression as the new files are only 11.2 % of the size of the original files.

File sizes of the fixed-up LAZ files

We also get a much cleaner raw LiDAR file that has only meaningful VLRs, more suitable coordinate resolution, and properly populated flight line information:

lasinfo report for one of the fixed-up files

And we now have flightline information that tells us, for example, that the (blue) cloud was captured by the third flight line.

We noticed something odd when loading the files with lasview: Although most flightlines were still in acquisition order (i.e. the points in the file are in the order acquired by the scanner), flightlines 8 and 10 were not. Perfectionists like us employ lassort with the option ‘-gps_time’ and reorder all twelve files by GPS time on 4 cores in parallel.

lassort -i LDR*cm.laz ^
        -gps_time ^
        -odix _sort -olaz ^
        -cores 4

Several calls to lasdiff – a tool that reports any content and order difference between two files – confirm that only the points of flightlines 8 and 10 are actually reordered by the above call. Here are two example calls:
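
The strip file names below are placeholders as the exact names depend on the download; the first call (on one of the reordered strips) reports the differences in point order, while the second (on a strip that was already in acquisition order) reports the files as identical:

lasdiff -i LDR_strip08_cm.laz ^
        -i LDR_strip08_cm_sort.laz

lasdiff -i LDR_strip02_cm.laz ^
        -i LDR_strip02_cm_sort.laz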

One – rather new – thing that we now recommend doing is to also add spatial indexing information by storing the little LAX files directly in the compressed LAZ files. This can be done in-place with

lasindex -i LDR*cm.laz ^
         -append ^
         -cores 8

Such spatial indexing information (see the video of our ELMF 2012 talk for more details) allows faster spatial queries that only read and decompress the actually queried area-of-interest. This is already used by all LAStools and can be exploited via the LASlib application programming interface. Soon there will also be support for spatially-indexed queries in the LASzip compression DLL.
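
As a sketch of what such a query looks like on the command line (the file name and the coordinates are placeholders), an area-of-interest extraction with las2las will then only read and decompress the chunks of the LAZ file that the LAX index lists as intersecting the query rectangle:

las2las -i LDR_strip05_cm.laz ^
        -inside 356000 4034000 356500 4034500 ^
        -o area_of_interest.laz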

Finally, should you have ortho photo imagery in TIF format (like it was kindly provided here by Prof. David Pyle) then you can embellish your LiDAR points with RGB using lascolor as shown here:

lascolor -i LDR*_cm.laz ^
         -image KameniOrtho.tif ^
         -odix _col -olaz ^
         -cores 4

Now you have raw LiDAR strips for online distribution that are as good as they get … (-:

Esri and rapidlasso develop joint LiDAR compressor

PRESS RELEASE (April Fools’ Day)
April 1, 2014
rapidlasso GmbH, Gilching, Germany

In a positive spin of events, Esri and rapidlasso are announcing that they will join forces and together develop an open source LiDAR compressor for LAS 1.4, avoiding unnecessary format fragmentation. Their new “LASeasy” tool not only compresses but also optimizes LAS files for efficient area-of-interest queries. LASeasy extends the popular LASzip compressor to handle LAS 1.4 content and includes the tiny spatial indexing *.lax files into the *.laz file via Extended Variable Length Records (EVLRs). More importantly, LASeasy provides new features such as optional spatial sorting and precomputed statistics – motivated by Esri – that allow exploiting LiDAR in the cloud.

To minimize disruption in existing workflows, their joint effort uses a clever strategy that capitalizes on the natural “break” in the ASPRS LAS format from version 1.3 to 1.4. LAS files compressed by Esri will automatically be upgraded to the new point types introduced with LAS 1.4 (and be losslessly downgraded on decompression). LiDAR software already supporting LAZ will instantly be able to read all LiDAR produced by Esri with the same DLL update that will be needed to access future compressed LAS 1.4 content – achieving maximum compatibility with minimal disruption for users of ArcGIS, LASzip, and the larger LiDAR community.

Martin Isenburg, chief scientist and CEO of rapidlasso GmbH, was all smiles during the announcement. “Yes, I had some hard feelings when hearing about their ‘LAZ clone‘ because our presumed open dialogue suddenly felt so very one-sided,” he said, “So over Martin Luther King weekend I proposed this LAS 1.4 trick as a joint development quoting MLK’s ‘We must accept finite disappointment, but never lose infinite hope’ and that seemed to resonate with them.” Speaking on the condition of anonymity an executive of Esri’s management added “For a global geospatial player like us it can happen that we do something ‘evil-by-accident’. We occasionally need someone like Martin to poke some good-natured fun at Esri to remind us of our values.”

LASeasy optimizes LAS files by reordering points along an adaptive space-filling curve for efficient LiDAR queries in the cloud. To access the corner of the LiDAR tile only the points shown in blue need to be loaded and decompressed.

Warming up for ILMF 2014, rapidlasso puts lean yet plush “LASlayers” on LiDAR

PRESS RELEASE
February 14, 2014
rapidlasso GmbH, Gilching, Germany

As a sweet foretaste to ILMF 2014, the creators of LAStools, LASzip, and PulseWaves are announcing “LASlayers” already on Valentine’s Day. The new functionality nicely complements their popular and widely-used LiDAR compressor, making the compressed LAZ files editable for most practical purposes. LASlayers significantly reduce the I/O load for writing modifications to LAS or LAZ files, especially when batch-processing LiDAR tiles on many cores in parallel or when sending changes to LiDAR files across bandwidth-limited networks.

LASlayers store modifications and additional attributes to raw LAS or LAZ files in small LAY files, avoiding replicating data that has not changed.

Most point attributes (e.g. coordinates, intensities, scan angles, GPS times, etc.) are not modified when processing LiDAR. LASlayers avoid re-writing the unchanged portions of LAS or LAZ files by storing only the actual differences layer by layer in a new “LAY” file. Changing the point classifications or deleting a few points, for example, can be done with LAY files that are just a tiny fraction of the size of a new LAS or LAZ file. Adding new attributes such as RGB colors or the height above ground with LASlayers means only this new information needs to be written.

This also provides simultaneous access to different versions of the data: a LiDAR server or a Web portal can store only a single copy of the raw LiDAR and apply LASlayers as needed on-the-fly, for example, to replace ellipsoidal with orthometric elevations or to add RGB colors.

Even users of other LiDAR processing software can readily take advantage of LASlayers with the new “laslayers.exe” tool that computes the difference between a raw and a modified LAS or LAZ file and expresses it as a LAY file (assuming the point order has not changed). A typical use case is the exchange of modifications to LiDAR files between remote locations such as a vendor in Australia or Canada and a data processing center in China or India. Instead of up- and downloading the entire modified LAS or LAZ files, only the much smaller LAY files need to be sent back and forth.

A fully featured prototype of LASlayers is available (10 MB including data) together with three simple exercises that illustrate the concept and allow interested parties to test it already today on their own data.