Plots to Stands: Producing LiDAR Vegetation Metrics for Imputation Calculations

Some professionals in remote sensing find LAStools a useful tool to extract statistical metrics from LiDAR that are used to make estimations about a larger area of land from a small set of sample plots. Common applications are prediction of the timber volume or the above-ground biomass for entire forests based on a number of representative plots where exact measurements were obtained with field work. The same technique can also be used to make estimations about animal habitat or coconut yield or to classify the type of vegetation that covers the land. In this tutorial we describe the typical workflow for computing common metrics for smaller plots and larger areas using LAStools.

Download these six LiDAR tiles (1, 2, 3, 4, 5, 6) from a Eucalyptus plantation in Brazil to follow along the step by step instructions of this tutorial. This data is courtesy of Suzano Pulp and Paper. Please also download the two shapefiles that delineate the plots where field measurements were taken and the stands for which predictions are to be made. You should download version 170327 (or higher) of LAStools due to some recent bug fixes.

Quality Checking

Before processing newly received LiDAR data we always perform a quality check first. This ranges from visual inspection with lasview, to printing textual content reports and attribute histograms with lasinfo, to flight-line alignment checks with lasoverlap, pulse density and pulse spacing checks with lasgrid and las2dem, and completeness-of-returns check with lassort followed by lasreturn.

lasinfo -i tiles_raw\CODL0003-C0006.laz ^
        -odir quality -odix _info -otxt

The lasinfo report tells us that there is no projection information. However, we remember that this Brazilian data was in the common SIRGAS 2000 projection and try for a few likely UTM zones whether the hillshaded DSM produced by las2dem falls onto the right spot in Google Earth.

las2dem -i tiles_raw\CODL0003-C0006.laz ^
        -keep_first -thin_with_grid 1 ^
        -hillshade -epsg 31983 ^
        -o epsg_check.png

Hillshaded DSM and Google Earth imagery align for EPSG code 31983

The lasinfo report also tells us that the xyz coordinates are stored with millimeter resolution which is a bit of an overkill. For higher and faster LASzip compression we will later lower this to a more appropriate centimeter resolution. It further tells us that the returns are stored using point type 0 and that is a bit unfortunate. This (older) point type does not have a GPS time stamp so that some quality checks (e.g. “completeness of returns” with lasreturn) and operations (e.g. “resorting of returns into acquisition order” with lassort) will not be possible. Fortunately the min-max range of the ‘point source ID’ suggests that this point attribute is correctly populated with flightline numbers so that we can do a check for overlap and alignment of the different flightlines that contribute to the LiDAR in each tile.

lasoverlap -i tiles_raw\*.laz ^
           -min_diff 0.2 -max_diff 0.4 ^
           -epsg 31983 ^
           -odir quality -opng ^
           -cores 3

We run lasoverlap to visualize the amount of overlap between flightlines and the vertical differences between them. The produced images (see below) color code the number of flightlines and the maximum vertical difference between any two flightlines as seen below. Most of the area is cyan (2 flightlines) except in the bottom left where the pilot was sloppy and left some gaps in the yellow seams (3 flightlines) so that some spots are only blue (1 flightline). We also see that two tiles in the upper left are partly covered by a diagonal flightline. We will drop that flightline later to create a more uniform density.across the tiles. The mostly blue areas in the difference are mostly aligned with features in the landscape and less with the flightline pattern. Closer inspection shows that these vertical difference occur mainly in the dense old growth forests with species of different heights that are much harder to penetrate by the laser than the uniform and short-lived Eucalyptus plantation that is more of a “dead forest” with little undergrowth or animal habitat.

Interesting observation: The vertical difference of the lowest return from different flightlines computed per 2 meter by 2 meter grid cell could maybe be used a new forestry metric to help distinguish mono cultures from natural forests.

lasgrid -i tiles_raw\*.laz ^
        -keep_last ^
        -step 2 -point_density ^
        -false -set_min_max 10 20 ^
        -epsg 31983 ^
        -odir quality -odix _d_2m_10_20 -opng ^
        -cores 3

lasgrid -i tiles_raw\*.laz ^
        -keep_last ^
        -step 5 -point_density ^
        -false -set_min_max 10 20 ^
        -epsg 31983 ^
        -odir quality -odix _d_5m_10_20 -opng ^
        -cores 3

We run lasgrid to visualize the pulse density per 2 by 2 meter cell and per 5 by 5 meter cell. The produced images (see below) color code the number of last return per square meter. The impact of the tall Eucalyptus trees on the density per cell computation is evident for the smaller 2 meter cell size in form of a noisy blue/red diagonal in the top right as well as a noisy blue/red area in the bottom left. Both of those turn to a more consistent yellow for the density per cell computation with larger 5 meter cells. Immediately evident is the higher density (red) for those parts or the two tiles in the upper left that are covered by the additional diagonal flightline. The blueish area left to the center of the image suggests a consistently lower pulse density whose cause remains to be investigated: Lower reflectivity? Missing last returns? Cloud cover?

The lasinfo report suggests that the tiles are already classified. We could either use the ground classification provided by the vendor or re-classify the data ourselves (using lastilelasnoise, and lasground). We check the quality of the ground classification by visually inspecting a hillshaded DTM created with las2dem from the ground returns. We buffer the tiles on-the-fly for a seamless hillshade without artifacts along tile boundaries by adding ‘-buffered 25’ and ‘-use_orig_bb’ to the command-line. To speed up reading the 25 meter buffers from neighboring tiles we first create a spatial indexing with lasindex.

lasindex -i tiles_raw\*.laz ^
         -cores 3

las2dem -i tiles_raw\*.laz ^
        -buffered 25 ^
        -keep_class 2 -thin_with_grid 0.5 ^
        -use_orig_bb ^
        -hillshade -epsg 31983 ^
        -odir quality -odix _dtm -opng ^
        -cores 3

hillshaded DTM tiles generated with las2dem and on-the-fly buffering

The resulting hillshaded DTM shows a few minor issues in the ground classification but also a big bump (above the mouse cursor). Closer inspection of this area (you can cut it from the larger tile using the command-line below) shows that there is a group of miss-classified points about 1200 meters below the terrain. Hence, we will start from scratch to prepare the data for the extraction of forestry metrics.

las2las -i tiles_raw\CODL0004-C0006.laz ^
        -inside_tile 207900 7358350 100 ^
        -o bump.laz

lasview -i bump.laz

bump in hillshaded DTM caused by miss-classified low noise

Data Preparation

Using lastile we first tile the data into smaller 500 meter by 500 meter tiles with 25 meter buffer while flagging all points in the buffer as ‘withheld’. In the same step we lower the resolution to centimeter and put nicer a coordinate offset in the LAS header. We also remove the existing classification and classify all points that are much lower than the target terrain as class 7 (aka noise). We also add CRS information and give each tile the base name ‘suzana.laz’.

lastile -i tiles_raw\*.laz ^
        -rescale 0.01 0.01 0.01 ^
        -auto_reoffset ^
        -set_classification 0 ^
        -classify_z_below_as 500.0 7 ^
        -tile_size 500 ^
        -buffer 25 -flag_as_withheld ^
        -epsg 31983 ^
        -odir tiles_buffered -o suzana.laz

With lasnoise we mark the many isolated points that are high above or below the terrain as class 7 (aka noise). Using two tiles we played around with the ‘step’ parameters until we find good parameter settings. See the README of lasnoise for the exact meaning and the choice of parameters for noise classification. We look at one of the resulting tiles with lasview.

lasnoise -i tiles_buffered\*.laz ^
         -step_xy 4 -step_z 2 ^
         -odir tiles_denoised -olaz ^
         -cores 3

lasview -i tiles_denoised\suzana_206000_7357000.laz ^
        -color_by_classification ^
        -win 1024 192

noise points shown in pink: all points (top), only noise points (bottom)

Next we use lasground to classify the last returns into ground (2) and non-ground (1). It is important to ignore the noise points with classification 7 to avoid the kind of bump we saw in the vendor-delivered classification. We again check the quality of the computed ground classification by producing a hillshaded DTM with las2dem. Here the las2dem command-line is sightly different as instead of buffering on-the-fly we use the buffers stored with each tile.

lasground -i tiles_denoised\*.laz ^
          -ignore_class 7 ^
          -nature -extra_fine ^
          -odir tiles_ground -olaz ^
          -cores 3

las2dem -i tiles_ground\*.laz ^
        -keep_class 2 -thin_with_grid 0.5 ^
        -hillshade ^
        -use_tile_bb ^
        -odir quality -odix _dtm_new -opng ^
        -cores 3

Finally, with lasheight we compute how high each return is above the triangulated surface of all ground returns and store this height value in place of the elevation value into the z coordinate using the ‘-replace_z’ switch. This height-normalizes the LiDAR in the sense that all ground returns are set to an elevation of 0 while all other returns get an elevation relative to the ground. The result are height-normalized LiDAR tiles that are ready for producing the desired forestry metrics.

lasheight -i tiles_ground\*.laz ^
          -replace_z ^
          -odir tiles_normalized -olaz ^
          -cores 3
Metric Production

The tool for computing the metrics for the entire area as well as for the individual field plots is lascanopy. Which metrics are well suited for your particular imputation calculation is your job to determine. Maybe first compute a large number of them and then eliminate the redundant ones. Do not use any point from the tile buffers for these calculations. We had flagged them as ‘withheld’ during the lastile operation, so they are easy to drop. We also want to drop the noise points that were classified as 7. And we were planning to drop that additional diagonal flightline we noticed during quality checking. We generated two lasinfo reports with the ‘-histo point_source 1’ option for the two tiles it was covering. From the two histograms it was easy to see that the diagonal flightline has the number 31.

First we run lascanopy on the 11 plots that you can download here. When running on plots it makes sense to first create a spatial indexing with lasindex for faster querying so that only those tiny parts of the LAZ file need to be loaded that actually cover the plots.

lasindex -i tiles_normalized\*.laz ^
         -cores 3

lascanopy -i tiles_normalized\*.laz -merged ^
          -drop_withheld ^
          -drop_class 7 ^
          -drop_point_source 31 ^
          -lop WKS_PLOTS.shp ^
          -cover_cutoff 3.0 ^
          -cov -dns ^
          -height_cutoff 2.0 ^
          -c 2.0 999.0 ^
          -max -avg -std -kur ^
          -p 25 50 75 95 ^
          -b 30 50 80 ^
          -d 2.0 5.0 10.0 50.0 ^
          -o plots.csv

The resulting ‘plots.csv’ file you can easily process in other software packages. It contains one line for each polygonal plot listed in the shapefile that lists its bounding box followed by all the requested metrics. But is why is there a zero maximum height (marked in orange) for plots 6 though 10? All height metrics are computed solely from returns that are higher than the ‘height_cutoff’ that was set to 2 meters. We added the ‘-c 2.0 999.0’ absolute count metric to keep track of the number of returns used in these calculations. Apparently in plots 6 though 10 there was not a single return above 2 meters as the count (also marked in orange) is zero for all these plots. Turns out this Eucalyptus stand had recently been harvested and the new seedlings are still shorter than 2 meters.

more plots.csv
index,min_x,min_y,max_x,max_y,max,avg,std,kur,p25,p50,p75,p95,b30,b50,b80,c00,d00,d01,d02,cov,dns
0,206260.500,7358289.909,206283.068,7358312.477,11.23,6.22,1.91,2.26,4.71,6.01,7.67,9.5,26.3,59.7,94.2,5359,18.9,41.3,1.5,76.3,60.0
1,206422.500,7357972.909,206445.068,7357995.477,13.54,7.5,2.54,1.97,5.32,7.34,9.65,11.62,26.9,54.6,92.2,7030,12.3,36.6,13.3,77.0,61.0
2,206579.501,7358125.909,206602.068,7358148.477,12.22,5.72,2.15,2.5,4.11,5.32,7.26,9.76,46.0,73.7,97.4,4901,24.8,28.7,2.0,66.8,51.2
3,206578.500,7358452.910,206601.068,7358475.477,12.21,5.68,2.23,2.64,4.01,5.14,7.18,10.04,48.3,74.1,95.5,4861,25.7,26.2,2.9,68.0,50.2
4,206734.501,7358604.910,206757.068,7358627.478,15.98,10.3,2.18,2.64,8.85,10.46,11.9,13.65,3.3,27.0,91.0,4946,0.6,32.5,44.5,91.0,77.5
5,207043.501,7358761.910,207066.068,7358784.478,15.76,10.78,2.32,3.43,9.27,11.03,12.49,14.11,3.2,20.7,83.3,4819,1.5,24.7,51.0,91.1,76.8
6,207677.192,7359630.526,207699.760,7359653.094,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.0,0.0,0.0,0,0.0,0.0,0.0,0.0,0.0
7,207519.291,7359372.366,207541.859,7359394.934,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.0,0.0,0.0,0,0.0,0.0,0.0,0.0,0.0
8,207786.742,7359255.850,207809.309,7359278.417,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.0,0.0,0.0,0,0.0,0.0,0.0,0.0,0.0
9,208159.017,7358997.344,208181.584,7359019.911,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.0,0.0,0.0,0,0.0,0.0,0.0,0.0,0.0
10,208370.909,7358602.565,208393.477,7358625.133,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.0,0.0,0.0,0,0.0,0.0,0.0,0.0,0.0

Then we run lascanopy on the entire area and produce one raster per tile for each metric. Here we remove the buffered points with the ‘-use_tile_bb’ switch that also ensures that the produced rasters have exactly the extend of the tiles without buffers. Again, it is imperative that you download the version 170327 (or higher) of LAStools for this to work correctly.

lascanopy -version
LAStools (by martin@rapidlasso.com) version 170327 (academic)

lascanopy -i tiles_normalized\*.laz ^
          -use_tile_bb ^
          -drop_class 7 ^
          -drop_point_source 31 ^
          -step 10 ^
          -cover_cutoff 3.0 ^
          -cov -dns ^
          -height_cutoff 2.0 ^
          -c 2.0 999.0 ^
          -max -avg -std -kur ^
          -p 25 50 75 95 ^
          -b 30 50 80 ^
          -d 2.0 5.0 10.0 50.0 ^
          -odir tile_metrics -oasc ^
          -cores 3

The resulting rasters in ASC format can easily be previewed using lasview for some “sanity checking” that our metrics make sense and to get a quick overview about what these metrics look like.

lasview -i tile_metrics\suzana_*max.asc
lasview -i tile_metrics\suzana_*p95.asc
lasview -i tile_metrics\suzana_*p50.asc
lasview -i tile_metrics\suzana_*p25.asc
lasview -i tile_metrics\suzana_*cov.asc
lasview -i tile_metrics\suzana_*d00.asc
lasview -i tile_metrics\suzana_*d01.asc
lasview -i tile_metrics\suzana_*b30.asc
lasview -i tile_metrics\suzana_*b80.asc

The maximum height rasters are useful to inspect more closely as they will immediately tell us if there was any high noise point that slipped through the cracks. And indeed it happened as we see a maximum of 388.55 meters for of the 10 by 10 meter cells. As we know the expected height of the trees we could have added a ‘-drop_z_above 70’ to the lascanopy command line. Careful, however, when computing forestry metrics in strongly sloped terrains as the terrain slope can significantly lift up returns to heights much higher than that of the tree. This is guaranteed to happen for LiDAR returns from branches that are extending horizontally far over the down-sloped part of the terrain as shown in this paper here.

We did not use the shapefile for the stands in this exercise. We could have clipped the normalized LiDAR points to these stands using lasclip as shown in the GUI below before generating the raster metrics. This would have saved space and computation time as many of the LiDAR points lie outside of the stands. However, it might be better to do that clipping step on the rasters in whichever GIS software or statistics package you are using for the imputation computation to properly account for partly covered raster cells along the stand boundary. This could be subject of another blog article … (-:

not all LiDAR was needed to compute metrics for

Pre-Processing Mobile Rail LiDAR with LAStools

The majority of LAStools users are processing airborne LiDAR. That should not surprise as airborne is by far the most common form of LiDAR in terms of square kilometers covered. The availability of LiDAR as “open data” is also pretty much restricted to airborne surveys, which are often tax-payer funded and then distributed freely to achieve maximum return of investment.

But folks are increasingly using our software to do some of the “heavy lifting” for mobile LiDAR, either mounted on a truck for scanning cities or on a train for capturing railroad infrastructure. The LiDAR collected for the cities of Budapest and Singapore, for example, was pre-processed by multi-core scripted LAStools when the scanning trucks returned with their daily trajectories worth of point clouds captured by a RIEGL VMX-450 mobile mapping system.

One customer who was recently scanning railroad infrastructure wanted to do automatic ground classification as a first step prior to further segmentation of the data. We were asked for advice because on such data the standard settings of lasground left too many patches of ground unclassified. Also the uniform tiling lastile generates by default is not a good way to break such data into manageable pieces given the drastically varying point densities in mobile scanning.

We obtained a 217 MB file in LAZ format with 40 million points corresponding to a 2.7 km stretch of railway track. We first run a quick lasindex (with the options for ‘mobile’) on the file that creates a spatial indexing LAX file with maximally 10 meter resolution. This not only allows faster area-of-interest queries but also gives us a more detailed preview than just the bounding box of where the LiDAR points actually are in the GUI of LAStools.

mobile_rail_lidar_01

Presence of LAX files results in actual extend of LiDAR being shown in GUI.

lasindex -i segment.laz -tile_size 10 -maximum -100

We then run lastile four times to create an adaptive tiling in which no tile has more than 6 million points. The first call creates the initial 1000 by 1000 meter tiles. The following three calls refine all those tiles that still have more than 6 million points first into 500 by 500 meter, then 250 by 250 meter, and finally 125 by 125 meter tiles in parallel on 4 cores. Note the ‘-refine_tiling’ option is used in the first call to lastile and the ‘-refine_tiles’ option in all subsequent calls.

lastile -i segment.laz ^
        -tile_size 1000 ^
        -buffer 10 -flag_as_withheld ^
        -refine_tiling 6000000 ^
        -odir tiles_raw -o rail.laz
lastile -i tiles_raw\*_1000.laz ^
        -flag_as_withheld ^
        -refine_tiles 6000000 ^
        -olaz ^
        -cores 4
lastile -i tiles_raw\*_500.laz ^
        -flag_as_withheld ^
        -refine_tiles 6000000 ^
        -olaz ^
        -cores 4
lastile -i tiles_raw\*_250.laz ^
        -flag_as_withheld ^
        -refine_tiles 6000000 ^
        -olaz ^
        -cores 4

The resulting tiles all have fewer than 6 million points but still have the initial 10 meter buffer that was specified by the first call to lastile. Two tiles were sufficiently small after the 1st call, three tiles after the 2nd call, eleven tiles after 3rd call, and three tiles after the 4th.

contents of tile shown in blue in adaptive tiling below

points of adaptive tile (high-lighted in blue below) colored by intensity

Adaptive tiling created with four calls to lastile.

Adaptive tiling created with four calls to lastile. Scale factors of 0.00025 (see mouse cursor) implies that point coordinates are stored with quarter millimeter resolution. Lowering them to 0.001 would result in better compression and lower I/O.

Noise in the data – especially low noise – can lead lasground into choosing the wrong points during ground classification by latching on to those low noise points. We first classify the noise points into a different class (7) using lasnoise so we can later ignore them. These particular settings were found by experimenting on a few tiles with different values (see the README file) until visual inspection showed that most low points had been classified as noise.

lasnoise -i tiles_raw\*.laz ^
         -step_xy 0.5 -step_z 0.1 ^
         -odir tiles_denoised -olaz ^
         -cores 4
noise points shown in violett

noise points shown in violett

The points classified as noise will not be considered as ground points during the next step. For this it matters little that lamp posts, wires, or vegetation are wrongly marked as noise now. We can always undo their noise classification once the ground points were classified. Important is that those pointed to by the mouse cursor, which are below the desired ground, are excluded from consideration during the ground classification step. Here those low points are not actually noise but returns generated wherever the laser was able to “peek” through an opening to a lower surface.

lasground -i tiles_denoised\*.laz ^
          -ignore_class 7 ^
          -step 1 -sub 3 -bulge 0.1 -spike 0.1 -offset 0.02 ^
          -odir tiles_ground -olaz ^
          -cores 4

For classification with lasground there are a number of options to play with  (see the README file) but the most important is the correct step size. It is terrain along the railway track bed that is supposed to get represented well. The usual step of 5 to 40 meter for lasground aim at the removal of vegetation and man-made structures from airborne LiDAR. They are not the right choice here. A step of 1 and the parameters shown above gives us the ground shown below.

Classification of terrain along railway track using lasground with '-step 1'

Classification of terrain along railway track bed using lasground with ‘-step 1’

The new ‘-flag_as_withheld’ option in lastile that flags each point in the buffer with the withheld flag is useful in case we want to remove all buffer points on-the-fly, for example, in order to create a DTM hillshade of 25 cm resolution for a visual quality check of the entire 2.7 km track using blast2dem from the BLAST extension of LAStools.

blast2dem -i tiles_ground\*.laz -merged ^
          -drop_withheld -keep_class 2 ^
          -hillshade -step 0.25 ^
          -o dtm_hillshaded.png
Small 600 x 600 pixel detail of hill-shaded 5663 x 9619 pixel DTM raster generated by blast2dem

Small 600 x 600 pixel detail of hill-shaded 5663 x 9619 pixel DTM raster generated by blast2dem.

The dArc Force Awakens: ESRI escalates LiDAR format war

The empire has not changed their evil ways, despite an encouraging email from ESRI’s founder and president Jack Dangermond in response to the Open Letter by the OSGeo that was delivered to ESRI, OGC, and the ASPRS. Facing an incredible backlash by the LiDAR community over the release of their “LAZ clone” there was a new hope that unnecessary format fragmention could be avoided by working together within the Point Cloud Domain Working Group of the OGC. In fact only one thing happened: ESRI went silent on the controversy. They temporarily stopped promoting their “LAZ clone” and focused on locking in more content.

dArc_force_awakens

The message of the rebellion has been consistent and clear like in these two videos from the TC meeting of the OGC in Nottingham and the ASPRS side bar in Reno: a roadmap forward to avoid format fragmentation by exploiting the “natural break” in the format due to LAS 1.4. But there was zero technical contribution from ESRI during the past three PC-DWG meetings of the OGC. The slide sets that bored the audiences in Boulder and in Nottingham were not meant to contribute but merely stalled for time. Recently in Sydney ESRI was awefully quiet, knowing they were doing the exact opposite of what the OGC stands for. And now the empire strikes back.

laztozlas

There is a dArc force awakening that threatens the peace within the LiDAR community. ESRI has just released a new tool (see above) that enslaves point clouds by converting them from the open LAZ format to the near-identical but closed “LAZ clone” that they call “zLAS” or “Optimized LAS”. This comes just a few months after an entire nation‘s LiDAR was enslaved in this proprietary format. We have repeatedly warned about the ramifications of locking up Petabytes of LiDAR data in a closed format that is controlled by a single vendor.

ESRI is one of the largest GIS training organizations. By instructing LiDAR novices to “optimize” their LiDAR files and pushing LiDAR providers to switch from open LAS or open LAZ to closed zLAS, they effectively destroy the current success of our open formats. ESRI’s command of the GIS market can – little by little – turn their own proprietry format into the dominant way in which LiDAR point clouds are stored. Then we loose our open exchange formats. Hence, ESRI’s proprietary format threatens all that we have achieved with LAS (and LAZ) over the past years: compatible LiDAR data exchange and incredible LiDAR software interoperability.

ESRI is now escalating the LiDAR format wars. Join the rebellion, Jedis: download your lazer sabers and liberate some LiDAR.

This is not an anti-ESRI campaign. For the past three years we have been trying to resolve this situation. We have repeatedly reached out to ESRI to prevent format fragmentation. We have repeatedly offered to create a joint compressed format. We have plead, begged, and bargained for the sake of our LiDAR community and the sake of their ArcGIS user community not to promote a near-identical yet incompatible way for storing massive amounts of point cloud data.

RIEGL Becomes LASzip Sponsor for LAS 1.4 Extension

PRESS RELEASE (for immediate release)
August 31, 2015
rapidlasso GmbH, Gilching, Germany

We are happy to announce that RIEGL Laser Measurement Systems, Austria has become a sponsor of the award-winning LASzip compressor. Their contribution at the Silver level will kick-off the actual development phase of the “native LAS 1.4 extension” that had been discussed with the LiDAR community over the past two years. This “native extension” for LAS 1.4 complements the existing “compatibility mode” for LAS 1.4 that was supported by Gold sponsor NOAA and Bronze sponsors Quantum Spatial and Trimble Geospatial. The original sponsor who initiated and financed the open sourcing of the LASzip compressor was USACE – the US Army Corps of Engineers (see http://laszip.org).

The existing “LAS 1.4 compatibility mode” in LASzip was created to provide immediate support for compressing the new LAS 1.4 point types by rewriting them as old point types and storing their new information as “Extra Bytes”. As an added side-benefit this has allowed legacy software without LAS 1.4 support to readily read these newer LAS files as most of the important fields of the new point types 6 to 10 can be mapped to fields of the older point types 1, 3, or 5.

In contrast, the new “native LAS 1.4 extension” of LASzip that is now sponsored in part by RIEGL will utilize the “natural break” in the format due to the new point types of LAS 1.4 to introduce entirely new features such as “selective decompression”, “rewritable classifications and flags”, “integrated spatial indexing”, … and other functionality that has been brain-stormed with the community since rapidlasso GmbH had issued the open “call for input” on native LASzip compression for LAS 1.4 in January 2014. We invite you to follow the progress or contribute to the development via the discussions in the “LAS room“.

silverLASzip_m60_512_275

About rapidlasso GmbH:
Technology powerhouse rapidlasso GmbH specializes in efficient LiDAR processing tools that are widely known for their high productivity. They combine robust algorithms with efficient I/O and clever memory management to achieve high throughput for data sets containing billions of points. The company’s flagship product – the LAStools software suite – has deep market penetration and is heavily used in industry, government agencies, research labs, and educational institutions. Visit http://rapidlasso.com for more information.

About RIEGL:
Austrian based RIEGL Laser Measurement Systems is a performance leader in research, development and production of terrestrial, industrial, mobile, bathymetric, airborne and UAS-based laser scanning systems. RIEGL’s innovative hard- and software provides powerful solutions for nearly all imaginable fields of application. Worldwide sales, training, support and services are delivered from RIEGL‘s Austrian headquarters and its offices in Vienna, Salzburg, and Styria, main offices in the USA, Japan, and in China, and by a worldwide network of representatives covering Europe, North and South America, Asia, Australia and Africa. Visit http://riegl.com for more information.

Use Buffers when Processing LiDAR in Tiles !!!

We often process LiDAR in tiles for two reasons: first, to keep the number of points per file low and use main memory efficient, and second, to speed up the computation with parallel tile processing and keep all cores of a modern CPU busy. However, it is very (!!!) important to take the necessary precautions to avoid “edge artifacts” when processing LiDAR in tiles. We have to include points from neighboring tiles during certain LAStools processing steps to avoid edge artifacts. Why? Here is an illustration from our PHIL LiDAR tour earlier this year:

Buffers are important to avoid edge artifacts along tile boundaries during DTM creation.

Buffers are important to avoid edge artifacts along tile boundaries.

What you see is the temporary TIN of ground points created (internally) by las2dem or blast2dem that is then rastered at the user-specified step size onto a grid. Without a buffer (right side) there will not always be a triangle to cover every pixel. Especially in the corners of the tile you will often find empty pixels. Furthermore the poorly shaped “sliver triangles” along the boundary of the TIN do not interpolate the ground elevations properly. In contrast, with buffer (left side) the TIN generously covers the entire area that is to be rastered with nicely shaped triangles.

The

Christmas cookie analogy: buffers are like generously rolling out the dough

Here the christmas cookies analogy: You need to roll out the dough larger than the cookies you will cut to make sure your cookies will have nice edges. Think of the TIN as the dough and the square tile as your cookie cutter. You need to use a sufficiently large buffer when you roll out your TIN to assure an edge without crumbles when you cut out the tile … (-: … otherwise you are pretty much guaranteed to get results that – upon closer inspection – have these kind of artifacts:

Without buffers processing artifacts also happen when classifying points with lasground or lasclassify, when calculating height above ground or height-normalizing LiDAR tiles with lasheight, when removing noise with lasnoise, when creationg contours with las2iso or blast2iso, or any other operation where an incomplete neighborhood of points can affect the results. Hence, we need to surround each tile with a temporary buffer of points. Currently there are two ways of working with buffers with LAStools:

  1. creating buffered tiles during the initial tiling step with the ‘-buffer 25’ option of lastile, maintaining buffered tiles throughout processing and finally using the ‘-use_tile_bb’ option of lasgrid, las2dem, blast2dem, or lascanopy to raster the tiles without the temporary buffer.
  2. creating buffered tiles from non-overlapping (= unbuffered) tiles with “on-the-fly” buffering using the ‘-buffered 25’ option of most LAStools such as lasground, lasheight, or las2dem. For some workflows it is useful to also add ‘-remain_buffered’ if buffers are needed again in the next step. Finally, we use the ‘-use_orig_bb’ option of lasgrid, las2dem, blast2dem, or lascanopy to raster the tiles without the temporary buffer.

In the following three (tiny) examples using the venerable ‘fusa.laz’ sample that is distributed with LAStools to illustrate the two types of buffering as well as to show what happens when no buffers are used. In each example we will first cut the small ‘fusa.laz’ sample into nine smaller tiles and then process these separately on 4 cores in parallel.

1. Initial buffer creation with lastile

This is what most of my tutorials teach. It assumes you are the one creating the tiling in the first place. If you do it with lastile and add a buffer right from the start things are pretty easy.

lastile -i ..\data\fusa.laz ^
        -set_classification 0 -set_user_data 0 ^
        -tile_size 100 -buffer 20 ^
        -odir 1_raw -o futi.laz

We cut the input into 100 meter by 100 meter tiles but add a 20 meter buffer around each tile. That means that each tile on disk will contain the points for an area of up to 140 meter by 140 meter. The GUI for LAStools shows the overlap and if you scrutinize the bounding box values that the cursor points to you notice the extra 20 meters in each direction.

tiles_buffered_with_lastile

Now we can forget about the buffers and run the standard workflow consiting of lasground, lasheight, and lasclassify to distinguish ground, vegetation, and building points in the LiDAR tiles.

lasground -i 1_raw\futi*.laz ^
          -city ^
          -odir 1_ground -olaz ^
          -cores 4
lasheight -i 1_ground\futi*.laz ^
          -drop_above 50 ^
          -odir 1_height -olaz ^
          -cores 4
lasclassify -i 1_height\futi*.laz ^
            -odir 1_classify -olaz ^
            -cores 4

At the end – when we generate raster products – we have to remember that the tiles were buffered by lastile and cut off the buffers when we raster the TIN with option ‘-use_tile_bb’ of las2dem.

las2dem -i 1_classify\futi*.laz ^
        -keep_class 2 6 ^
        -step 0.25 -use_tile_bb ^
        -odir 1_dbm -obil ^
        -cores 4

We created a digital terrain model with buildings (DBM) by keeping the points with classification 2 (ground) and 6 (building). After loading the resulting 9 tiles into QGIS and generating a virtual raster we see a nice seamless DBM without any edge artifacts.

The DEM of the 9 tiles computed with buffers created by lastile has no edge artifacts acoss tile boundaries.

The DBM of the 9 tiles computed with buffers created by lastile has no edge artifacts acoss tile boundaries.

If you need to deliver the LiDAR files you should remove the buffers with lastile and option ‘-remove_buffer’.

lastile -i 1_classify\futi*.laz ^
        -remove_buffer ^
        -odir 1_final -olaz ^
        -cores 4

2. On-the-fly buffering

Now assume you are given LiDAR tiles without buffers. We generate them here with lastile.

lastile -i ..\data\fusa.laz ^
        -set_classification 0 -set_user_data 0 ^
        -tile_size 100 ^
        -odir 2_raw -o futi.laz

The only difference is that we do not request the 20 meter buffer and the result is a typical tiling as you may receive it from a vendor or download it from a LiDAR portal. The GUI for LAStools shows that there is no overlap and if you scrutinize the bounding box values that the cursor points to, you see that the tiles is exactly 100 meters by 100 meters.

tiles_without_buffer

Now we have to think about buffers a lot. When using on-the-fly buffering we should first spatially index the tiles with lasindex for faster access to the points from neighbouring tiles.

lasindex -i 1_raw\futi*.laz -cores 4

Below in red are the modifications for on-the-fly buffering to the standard workflow of lasground, lasheight, and lasclassify. The first lasground run uses ‘-buffered 20’ to add buffers to each tile and ‘-remain_buffered’ to write those buffers to disk. This way they do not have to created again by lasheight and lasclassify.

lasground -i 2_raw\futi*.laz ^
          -buffered 20 -remain_buffered ^
          -city ^
          -odir 2_ground -olaz ^
          -cores 4
lasheight -i 2_ground\futi*.laz ^
          -remain_buffered ^
          -drop_above 50 ^
          -odir 2_height -olaz ^
          -cores 4
lasclassify -i 2_height\futi*.laz ^
            -remain_buffered ^
            -odir 2_classify -olaz ^
            -cores 4

At the end we have to remember that the tiles still have on-the-fly buffers and them cut off with option ‘-use_orig_bb’ of las2dem.

las2dem -i 2_classify\futi*.laz ^
        -keep_class 2 6 ^
        -step 0.25 -use_orig_bb ^
        -odir 2_dbm -obil ^
        -cores 4

Again, we created a digital terrain model with buildings (DBM) by keeping the points with classification 2 (ground) and 6 (building). The resulting hillshade computed from a virtual raster that combines the 9 BIL rastera into one looks perfectly smooth in QGIS.

The hillshaded DEM of the 9 tiles computed with on-the-fly buffering has no edge artifacts acoss tile boundaries.

The hillshaded DBM of 9 tiles computed with on-the-fly buffering has no edge artifacts acoss tile boundaries.

If you need to deliver the LiDAR files you should probably remove the buffers first … (-:

lastile -i 2_classify\futi*.laz ^
        -remove_buffer ^
        -odir 2_final -olaz ^
        -cores 4

3. Bad: No buffering

Here what you are *not* supposed to do. Assuming you get unbuffered tiles.

lastile -i ..\data\fusa.laz ^
        -set_classification 0 -set_user_data 0 ^
        -tile_size 100 ^
        -odir 3_raw -o futi.laz

Bad. You do not take care about buffering when processing the tiles.

lasground -i 3_raw\futi*.laz ^
          -city ^
          -odir 3_ground -olaz ^
          -cores 4
lasheight -i 3_ground\futi*.laz ^
          -drop_above 50 ^
          -odir 3_height -olaz ^
          -cores 4
lasclassify -i 3_height\futi*.laz ^
            -odir 3_classify -olaz ^
            -cores 4

Bad. You do not take care about buffering when generating the DBM.

las2dem -i 3_classify\futi*.laz ^
        -keep_class 2 6 ^
        -step 0.25 ^
        -odir 3_dbm -obil ^
        -cores 4

Bad. You get crappy results with edge artifacts clearly visible in the hillshade.

The hillshaded DBM of 9 tiles computed WITHOUT using buffers has severe edge artifacts acoss tile boundaries.

The hillshaded DBM of 9 tiles computed WITHOUT using buffers has severe edge artifacts acoss tile boundaries.

Bad. If you zoom in on a corner where 4 tiles meet you find missing pixels and incorrect elevation values. Bad. Bad. Bad. So please folks. Try this on your own data. Notice the horrible edge artifacts. Then always use buffers … (-:

PS: Usually no buffers are needed for running lasgrid, lasoverlap, or lascanopy as they perform simple binning operations that do not make use of neighbour information.

Five Myths about LAS, LAZ, and “Optimized LAS”

The Open Letter by OSGeo was delivered to ESRI, OGC, and the ASPRS last week and the initial reponses – including an email from ESRI’s founder and president Jack Dangermond – are very encouraging. Attendees of last weeks’ ASPRS conference were discussing how to respond to ESRI’s proprietary “Optimized LAS” that threatens the achievements of the open LiDAR formats LAS and LAZ that the community has been using for many years now. Below five clarifications to five wrong statements overheard at these meetings:

1) Martin’s “LAZ” format is also proprietary.

Wrong. LAZ – just like LAS – is an open format. LAZ is defined by a well commented open reference implementation in C/C++ and described in a PE&RS paper published in February 2013. LAS is defined via a specification document but has no reference implementation. Both can be freely used by anyone and (re-)implemented on any operating system and in any programming language. For example, there is now a javascript version of LAZ that someone else created.

2) We have no argument because ESRI provides a free API for “Optimized LAS”.

Wrong. “Optimized LAS” can only be used via the mechanism, the programming language, and the operating system of ESRI’s choosing. This is the very definition of “proprietary format”. Here is what Wikipedia says:

A proprietary format is a file format of a company, organization, or individual that contains data that is ordered and stored according to a particular encoding-scheme, designed by the company or organization to be secret, such that the decoding and interpretation of this stored data is only easily accomplished with particular software or hardware that the company itself has developed. The specification of the data encoding format is not released, or underlies non-disclosure agreements.

In contrast an open format is a file format that is published and free to be used by everybody.

3) Martin’s “LAZ” format is only used by LAStools.

Wrong. Large parts of the LiDAR industry embrace LAZ and have added read & write support for the LAZ format using the open source code or the DLL. Examples are QT Modeler, Globalmapper, FME, Fugroviewer, ERDAS IMAGINE, ENVI LiDAR, Bentley Pointools, TopoDOT, FUSION, CloudCompare, Gexel R3, Pointfuse, …and many more. Notable exceptions are ArcGIS and the product line offered by Lewis Graham’s GeoCue group. We maintain an (incomplete) list of software with native LAZ support here.

4) ESRI has engineered “Optimized LAS” for the cloud and “LAZ” cannot compete.

Wrong. The extra functionality in “Optimized LAS” is a simple mash-up of LAZ with spatial indexing LAX, an optional spatial sort, and a few extra statistics. This is why ESRI’s format is also known as the “LAZ clone”. We were able to feature-match these minor engineering changes in an afternoon which – a few days later – resulted in this April Fools’ Day prank. In fact, LAZ has been used “in the cloud” for well over 4 years on OpenTopography – the first and probably the premier Web accessible LiDAR cloud service of our industry. It is also used by many other LiDAR download servers. We maintain an (incomplete) list of portals offering compressed LAZ here.

5) ESRI’s “Optimized LAS” does not prevent people from using LAS.

ESRI is one of the largest GIS training organizations. If they teach hundreds of LiDAR novices to “optimize” their “unoptimized LAS” files while simultaneously lobbying large LiDAR providers into switching from LAS or LAZ to zLAS they will effectively destroy the current success of our open formats. ESRI’s command of the GIS market can – little by little – turn their own proprietry format into the dominant way in which LiDAR point clouds are exchanged. Then we loose our open exchange formats. Hence, ESRI’s proprietary “Optimized LAS” format “threatens” what we have achieved with LAS (and LAZ): open LiDAR data exchange and incredible LiDAR software interoperability.

This is not an anti-ESRI campaign. We hope to work with ESRI to resolve this situation. Below an image and a quote from ESRI’s ArcNews Spring 2011 news letter about the importance of open formats, standards, and specifications …

ESRI: "Esri continues to advocate the need for open access to geographic data and functionality through support for widely adopted and practical standards and specifications. Esri follows an open system strategy for accessing and using geographic data and functionality."

“Esri continues to advocate the need for open access to geographic data and functionality through support for widely adopted and practical standards and specifications. Esri follows an open system strategy for accessing and using geographic data and functionality.” — ArcNews, Spring 2011

New LASliberator “frees” LiDAR from Closed Format

PRESS RELEASE (for immediate release)
April 20, 2015
rapidlasso GmbH, Gilching, Germany

The latest product by rapidlasso GmbH – creators of LAStools and LASzip – is an open source tool aiming to liberate LiDAR points locked-up in proprietary “Optimized LAS” – a highly controversial, closed LiDAR format. The new LASliberator can be downloaded here. It comes as both, a simple command line tool for scripting and with an easy-to-use graphical interface.

The GUI version of the LASliberaor has a simple and easy-to-use interface.

The GUI of the “LASliberator” has a simple, easy-to-use interface.

The LASliberator reads LiDAR points from closed “Optimized LAS” files that use the “.zlas” extension and converts them to open ASPRS LAS files that use the “.las” extension. Alternatively, the points can be stored to compressed LAZ files – using the open source LASzip compressor – that use the “.laz” extension. In addition, the tool creates tiny spatial indexing files that use the “.lax” extension. These can then be exploited for accelerated area-of-interest queries via open source LASindex when using LAStools or the latest version of the LASzip DLL.

Note that the LASliberator cannot entirely be open source as it depends on a particular proprietry library. The closed nature of the “Optimized LAS” format does not allow for a full open source implementation. It is therefore not possible to port the LASliberator to other operating systems or into other programming languages.

Selecing open in the GUI pops up a file selection dialogue allowing the user to find the file that is to be set free.

The user can select a file to liberate by pressing “open” in the GUI.

The new LASliberator comes on the heels of an outcry in the community over the LiDAR format fragmentation “Optimized LAS” is creating. It provides an immediate solution to go from closed zLAS to open LAZ for people whose LiDAR got stuck in yet-another-proprietary-format.

About rapidlasso GmbH:
Technology powerhouse rapidlasso GmbH specializes in efficient LiDAR processing tools that are widely known for their high productivity. They combine robust algorithms with efficient I/O and clever memory management to achieve high throughput for data sets containing billions of points. The company’s flagship product – the LAStools software suite – has deep market penetration and is heavily used in industry, government agencies, research labs, and educational institutions. Visit http://rapidlasso.com for more information.