Removing Low Noise from RIEGL’s VUX-1 UAV LiDAR flown in the Philippines

In this tutorial we remove some “tricky” low noise from LiDAR point clouds in order to produce a high-resolution Digital Terrain Model (DTM). The data was flown above a tropical beach and mangrove area in the Philippines using a VUX-1 UAV based system from RIEGL mounted on a helicopter. The survey was done as a test flight by AB Surveying, who have the capacity to fly such missions in the Philippines and who have allowed us to share this data with you for educational purposes. You can download the data (1 GB) here. It covers a popular twin beach known as “Nacpan” near El Nido in Palawan (that we happen to have visited in 2014).

A typical beach fringed by coconut palms in Palawan, Philippines.

We start our usual quality check with a run of lasinfo. We add the ‘-cd’ switch to compute an average point density and the ‘-histo gps_time 1’ switch to produce a 1 second histogram for the GPS time stamps.

lasinfo -i lalutaya.laz ^
        -cd ^
        -histo gps_time 1 ^
        -odix _info -otxt

You can download the resulting lasinfo report here. It tells us that there are 118,740,310 points of type 3 (with RGB colors) with an average density of 57 last returns per square meter. The point coordinates are in the “PRS92 / Philippines 1” projection with EPSG code 3121 that is based on the “Clarke 1866” ellipsoid.

Datum Transform

We prefer to work in a UTM projection based on the “WGS 1984” ellipsoid, so we will first perform a datum transform based on the seven-parameter Helmert transformation – a capability that was recently added to LAStools. For this we first transform to geocentric or Earth-Centered Earth-Fixed (ECEF) coordinates on the current “Clarke 1866” ellipsoid. Then we apply the Helmert transformation, which operates on geocentric coordinates and whose parameters are listed in the TOWGS84 string of EPSG code 3121, to get to geocentric or ECEF coordinates on the “WGS 1984” ellipsoid. Finally we convert the coordinates to the respective UTM zone. These three calls to las2las accomplish this.
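
For reference, in the position-vector convention the Helmert step applied to the geocentric coordinates amounts to the following, with translations t_x, t_y, t_z in meters, rotations r_x, r_y, r_z in arc-seconds (converted to radians), and the scale s in parts per million. Note that the coordinate-frame convention flips the signs of the rotations, so verify which convention your TOWGS84 parameters assume:

$$
\begin{pmatrix} X \\ Y \\ Z \end{pmatrix}_{\mathrm{WGS84}}
=
\begin{pmatrix} t_x \\ t_y \\ t_z \end{pmatrix}
+ \left(1 + s \cdot 10^{-6}\right)
\begin{pmatrix} 1 & -r_z & r_y \\ r_z & 1 & -r_x \\ -r_y & r_x & 1 \end{pmatrix}
\begin{pmatrix} X \\ Y \\ Z \end{pmatrix}_{\mathrm{Clarke\,1866}}
$$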

las2las -i lalutaya.laz ^
        -remove_all_vlrs ^
        -epsg 3121 ^
        -target_ecef ^
        -odix _ecef_clark1866 -olaz

las2las -i lalutaya_ecef_clark1866.laz ^
        -transform_helmert -127.62,-67.24,-47.04,-3.068,4.903,1.578,-1.06 ^
        -wgs84 -ecef ^
        -ocut 10 -odix _wgs84 -olaz
 
las2las -i lalutaya_ecef_wgs84.laz ^
        -target_utm auto ^
        -ocut 11 -odix _utm -olaz

In these steps we implicitly reduced the resolution that each coordinate is stored with from quarter-millimeters (i.e. scale factors of 0.00025) to the default of centimeters (i.e. scale factors of 0.01), which should be sufficient for subsequent vegetation analysis. The LAZ files also compress better when coordinates are stored at a lower resolution, so the ‘lalutaya_utm.laz’ file is over 200 MB smaller than the original ‘lalutaya.laz’ file. The quantization error this introduces is probably still below the overall scanning error of this helicopter survey.
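
Should you ever need to lower the coordinate resolution without a datum transform, the same reduction could also be done explicitly with the ‘-rescale’ option of las2las, for instance along these lines (a sketch; the ‘_cm’ output suffix is just our choice):

rem the '_cm' output suffix is an arbitrary choice
las2las -i lalutaya.laz ^
        -rescale 0.01 0.01 0.01 ^
        -odix _cm -olaz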

Flightline Recovery

Playing back the file visually with lasview suggests that it contains more than one flightline. Unfortunately the point source ID field of the file is not properly populated with flightline information. However, when scrutinizing the GPS time stamp histogram in the lasinfo report we can see an occasional gap. We highlight two of these gaps in red, between GPS seconds 540226 and 540272 and between GPS seconds 540497 and 540556, in this excerpt from the lasinfo report:

gps_time histogram with bin size 1
[...]
 bin 540224 has 125768
 bin 540225 has 116372
 bin 540226 has 2707
 bin 540272 has 159429
 bin 540273 has 272049
 bin 540274 has 280237
[...]
 bin 540495 has 187103
 bin 540496 has 180421
 bin 540497 has 126835
 bin 540556 has 228503
 bin 540557 has 275025
 bin 540558 has 273861
[...]

We can use lassplit to recover the original flightlines based on gaps in the continuity of GPS time stamps that are bigger than 10 seconds:

lassplit -i lalutaya_utm.laz ^
         -recover_flightlines_interval 10 ^
         -odir strips_raw -o lalutaya.laz

This operation splits the points into 11 separate flightlines. The points within each flightline are stored in the order that the vendor software – which was RiPROCESS 1.7.2 from RIEGL according to the lasinfo report – had written them to file. We can use lassort to bring them back into the order they were acquired in by sorting first on the GPS time stamp and then on the return number field:

lassort -i strips_raw\*.laz ^
        -gps_time -return_number ^
        -odir strips_sorted -olaz ^
        -cores 4

Now we turn the sorted flightlines into tiles (with buffers !!!) for further processing. We also erase the current classification of the data into ground (2) and medium vegetation (4) as a quick visual inspection with lasview immediately shows that those are not correct:

lastile -i strips_sorted\*.laz ^
        -files_are_flightlines ^
        -set_classification 0 ^
        -tile_size 250 -buffer 30 -flag_as_withheld ^
        -odir tiles_raw -o lalu.laz

Quality Checking

Next comes the standard check of flightline overlap and alignment with lasoverlap. Once more it becomes clear why it is so important to have flightline information. Without it we might have missed what we are about to notice. We create false color images that we load into Google Earth to visually assess the situation. We map all absolute differences between flightlines below 5 cm to white and all absolute differences above 30 cm to saturated red (positive) or blue (negative), with a gradual shading from white to red or blue for any differences in between. With lasboundary we also create an overview KML file that lets us quickly see in which tile we can find the points for a particular area of interest.

lasoverlap -i tiles_raw\*.laz ^
           -step 1 -min_diff 0.05 -max_diff 0.30 ^
           -odir quality -opng ^
           -cores 4

lasboundary -i tiles_raw\*.laz ^
            -use_tile_bb -overview -labels ^
            -o quality\overview.kml

The resulting visualizations show (a) that our datum transform to the WGS84 ellipsoid worked because the imagery aligns nicely with Google Earth and (b) that there are several issues in the data that require further scrutiny.

In general the data seems well aligned (most open areas are white) but there are blue and red lines crossing the survey area. With lasview we have a closer look at the visible blue lines running along the beach in tile ‘lalu_765000_1252750.laz’ by repeatedly pressing ‘x’ to select a different subset and ‘x’ again to view this subset up close, while pressing ‘c’ colors it differently:

lasview -i tiles_raw\lalu_765000_1252750.laz

These lines of erroneous points happen not only along the beach but also in the middle of and below the vegetation, as can be seen below:

Our initial hope was to use the higher than usual intensity of these erroneous points to reclassify them to some classification code that we would then exclude from further processing. Visually we found that a reasonable cut-off value for this tile would be an intensity above 35000:

lasview -i tiles_raw\lalu_765000_1252750.laz ^
        -keep_intensity_above 35000 ^
        -filtered_transform ^
        -set_classification 23

However, while this method seems successful on the tile shown above, it fails miserably on others such as ‘lalu_764250_1251500.laz’ where large parts of the beach are very reflective and result in high intensity returns due to the dry white sand:

lasview -i tiles_raw\lalu_764250_1251500.laz ^
        -keep_intensity_above 35000 ^
        -filtered_transform ^
        -set_classification 23

Low Noise Removal

In the following we describe a workflow that can remove the erroneous points below the ground so that we can at least construct a high-quality DTM from the data. This will not, however, remove the erroneous points above the ground, so a subsequent vegetation analysis would still be affected. Our approach is based on two observations: (a) the erroneous points affect only a relatively small area and (b) different flightlines have their erroneous points in different areas. The idea is to compute a set of coarser ground points separately for each flightline and – when combining them in the end – to pick higher ground points over lower ones. The combined points should then define a surface that is above the erroneous below-ground points so that we can mark those with lasheight as not to be used for the actual ground classification done thereafter.

The new huge_las_file_extract_ground_points_only.bat example batch script that you can download here does all the work needed to compute a set of coarser ground points for each flightline. Simply edit the file such that the LAStools variable points to your LAStools\bin folder and rename it to end with the *.bat extension. Then run:

huge_las_file_extract_ground_points_only strips_sorted\lalutaya_0000001.laz strips_ground_only\lalutaya_0000001.laz
huge_las_file_extract_ground_points_only strips_sorted\lalutaya_0000002.laz strips_ground_only\lalutaya_0000002.laz
huge_las_file_extract_ground_points_only strips_sorted\lalutaya_0000003.laz strips_ground_only\lalutaya_0000003.laz
...
huge_las_file_extract_ground_points_only strips_sorted\lalutaya_0000009.laz strips_ground_only\lalutaya_0000009.laz
huge_las_file_extract_ground_points_only strips_sorted\lalutaya_0000010.laz strips_ground_only\lalutaya_0000010.laz
huge_las_file_extract_ground_points_only strips_sorted\lalutaya_0000011.laz strips_ground_only\lalutaya_0000011.laz

The details of how this batch script works – a pretty standard tile-based multi-core processing workflow – are given as comments in the batch script itself. Now we have a set of individual ground points computed separately for each flightline, and some will include erroneous points below the ground that the lasground algorithm, by its very nature, is likely to latch on to, as you can see here:
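
If you prefer to see the idea as plain command lines, the core of such a per-flightline run could be sketched roughly as follows. This is a simplified sketch under our own parameter assumptions, not the exact content of the batch script, and the ‘temp’ folder names are placeholders:

rem simplified sketch (not the actual batch script): tile one flightline,
rem ground-classify the tiles on multiple cores, keep only the ground
rem points, and merge them back into one per-flightline file;
rem the temp_* folder names are placeholders

lastile -i strips_sorted\lalutaya_0000001.laz ^
        -tile_size 250 -buffer 30 -flag_as_withheld ^
        -odir temp_tiles -o temp.laz

lasground_new -i temp_tiles\*.laz ^
              -town -hyper_fine ^
              -odir temp_tiles_ground -olaz ^
              -cores 4

las2las -i temp_tiles_ground\*.laz ^
        -keep_class 2 -drop_withheld ^
        -odir temp_tiles_ground_only -olaz ^
        -cores 4

lasmerge -i temp_tiles_ground_only\*.laz ^
         -o strips_ground_only\lalutaya_0000001.laz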

The trick is now to utilize the redundancy of multiple scans per area and – when combining flightlines – to pick higher rather than lower ground points in overlap areas. We do this with lasthin by keeping the ground point closest to the 75th elevation percentile per 2 meter by 2 meter area, computed only for areas with at least 3 points:

lasthin -i strips_ground_only\*.laz -merged ^
        -step 2 -percentile 75 3 ^
        -o lalutaya_ground_only_2m_75_3.laz

There are still some non-ground points in the result, as ground-classifying flightlines individually often results in vegetation returns being included in sparse areas along the edges of the flightlines, but we can easily get rid of those:

lasground_new -i lalutaya_ground_only_2m_75_3.laz ^
              -town -hyper_fine ^
              -odix _g -olaz

We sort the remaining ground points into a space-filling curve order with lassort and spatially index them with lasindex so they can be efficiently accessed by lasheight in the next step.

lassort -i lalutaya_ground_only_2m_75_3_g.laz ^
        -keep_class 2 ^
        -o lalutaya_ground.laz

lasindex -i lalutaya_ground.laz

Finally we have the means to robustly remove the erroneous points below the ground from all tiles. We use lasheight with the ground points we have just so painstakingly computed to classify all points 20 cm or more below the ground surface they define into classification code 23. Later we can simply ignore this classification code during processing:

lasheight -i tiles_raw\*.laz ^
          -ground_points lalutaya_ground.laz ^
          -do_not_store_in_user_data ^
          -classify_below -0.2 23 ^
          -odir tiles_cleaned -olaz ^
          -cores 4

Rather than trying to ground classify all remaining points we run lasground on a thinned subset of all points. For this we mark the lowest point in every 20 cm by 20 cm grid cell with some temporary classification code such as 6.

lasthin -i tiles_cleaned\*.laz ^
        -ignore_class 23 ^
        -step 0.20 -lowest -classify_as 6 ^
        -odir tiles_thinned -olaz ^
        -cores 4

Then we run lasground_new to compute the ground classification, considering only the points with classification code 6 by ignoring all points with classification codes 23 and 0.

lasground_new -i tiles_thinned\*.laz ^
              -ignore_class 23 0 ^
              -city -hyper_fine ^
              -odir tiles_ground_new -olaz ^
              -cores 4

And finally we can create a DTM with a resolution of 25 cm using las2dem and the result is truly beautiful:

las2dem -i tiles_ground_new\*.laz ^
        -keep_class 2 ^
        -step 0.25 -use_tile_bb ^
        -odir tiles_dtm_25cm -obil ^
        -cores 4

We have to admit that a few bumps are left (see mouse cursor below) but adjusting the parameters presented here is left as an exercise to the reader.

We would again like to acknowledge AB Surveying whose generosity has made this blog article possible. They have the capacity to fly such missions in the Philippines and have allowed us to share this data with you for educational purposes.

Plots to Stands: Producing LiDAR Vegetation Metrics for Imputation Calculations

Some professionals in remote sensing find LAStools a useful tool to extract statistical metrics from LiDAR that are used to make estimations about a larger area of land from a small set of sample plots. Common applications are prediction of the timber volume or the above-ground biomass for entire forests based on a number of representative plots where exact measurements were obtained with field work. The same technique can also be used to make estimations about animal habitat or coconut yield or to classify the type of vegetation that covers the land. In this tutorial we describe the typical workflow for computing common metrics for smaller plots and larger areas using LAStools.

Download these six LiDAR tiles (1, 2, 3, 4, 5, 6) from a Eucalyptus plantation in Brazil to follow along with the step-by-step instructions of this tutorial. This data is courtesy of Suzano Pulp and Paper. Please also download the two shapefiles that delineate the plots where field measurements were taken and the stands for which predictions are to be made. You should download version 170327 (or higher) of LAStools due to some recent bug fixes.

Quality Checking

Before processing newly received LiDAR data we always perform a quality check first. This ranges from visual inspection with lasview, to printing textual content reports and attribute histograms with lasinfo, to flight-line alignment checks with lasoverlap, pulse density and pulse spacing checks with lasgrid and las2dem, and completeness-of-returns check with lassort followed by lasreturn.

lasinfo -i tiles_raw\CODL0003-C0006.laz ^
        -odir quality -odix _info -otxt

The lasinfo report tells us that there is no projection information. However, we remember that this Brazilian data was in the common SIRGAS 2000 projection, so we try a few likely UTM zones and check whether the hillshaded DSM produced by las2dem falls onto the right spot in Google Earth.

las2dem -i tiles_raw\CODL0003-C0006.laz ^
        -keep_first -thin_with_grid 1 ^
        -hillshade -epsg 31983 ^
        -o epsg_check.png

Hillshaded DSM and Google Earth imagery align for EPSG code 31983

The lasinfo report also tells us that the xyz coordinates are stored with millimeter resolution which is a bit of an overkill. For higher and faster LASzip compression we will later lower this to a more appropriate centimeter resolution. It further tells us that the returns are stored using point type 0 and that is a bit unfortunate. This (older) point type does not have a GPS time stamp so that some quality checks (e.g. “completeness of returns” with lasreturn) and operations (e.g. “resorting of returns into acquisition order” with lassort) will not be possible. Fortunately the min-max range of the ‘point source ID’ suggests that this point attribute is correctly populated with flightline numbers so that we can do a check for overlap and alignment of the different flightlines that contribute to the LiDAR in each tile.
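
To double-check the flightline numbering, one could, for instance, print a per-flightline histogram of the point source IDs with lasinfo before running the overlap check (the ‘_ps_histo’ output suffix is just our choice):

rem the '_ps_histo' output suffix is an arbitrary choice
lasinfo -i tiles_raw\CODL0003-C0006.laz ^
        -histo point_source 1 ^
        -odir quality -odix _ps_histo -otxt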

lasoverlap -i tiles_raw\*.laz ^
           -min_diff 0.2 -max_diff 0.4 ^
           -epsg 31983 ^
           -odir quality -opng ^
           -cores 3

We run lasoverlap to visualize the amount of overlap between flightlines and the vertical differences between them. The produced images (see below) color code the number of flightlines and the maximum vertical difference between any two flightlines. Most of the area is cyan (2 flightlines) except in the bottom left where the pilot was sloppy and left some gaps in the yellow seams (3 flightlines) so that some spots are only blue (1 flightline). We also see that two tiles in the upper left are partly covered by a diagonal flightline. We will drop that flightline later to create a more uniform density across the tiles. The blue areas in the difference image are mostly aligned with features in the landscape and less with the flightline pattern. Closer inspection shows that these vertical differences occur mainly in the dense old-growth forests with species of different heights that are much harder to penetrate by the laser than the uniform and short-lived Eucalyptus plantation, which is more of a “dead forest” with little undergrowth or animal habitat.

Interesting observation: The vertical difference of the lowest return from different flightlines computed per 2 meter by 2 meter grid cell could maybe be used as a new forestry metric to help distinguish monocultures from natural forests.

lasgrid -i tiles_raw\*.laz ^
        -keep_last ^
        -step 2 -point_density ^
        -false -set_min_max 10 20 ^
        -epsg 31983 ^
        -odir quality -odix _d_2m_10_20 -opng ^
        -cores 3

lasgrid -i tiles_raw\*.laz ^
        -keep_last ^
        -step 5 -point_density ^
        -false -set_min_max 10 20 ^
        -epsg 31983 ^
        -odir quality -odix _d_5m_10_20 -opng ^
        -cores 3

We run lasgrid to visualize the pulse density per 2 by 2 meter cell and per 5 by 5 meter cell. The produced images (see below) color code the number of last returns per square meter. The impact of the tall Eucalyptus trees on the density per cell computation is evident for the smaller 2 meter cell size in the form of a noisy blue/red diagonal in the top right as well as a noisy blue/red area in the bottom left. Both of those turn to a more consistent yellow for the density per cell computation with the larger 5 meter cells. Immediately evident is the higher density (red) for those parts of the two tiles in the upper left that are covered by the additional diagonal flightline. The blueish area to the left of the center of the image suggests a consistently lower pulse density whose cause remains to be investigated: Lower reflectivity? Missing last returns? Cloud cover?

The lasinfo report suggests that the tiles are already classified. We could either use the ground classification provided by the vendor or re-classify the data ourselves (using lastile, lasnoise, and lasground). We check the quality of the ground classification by visually inspecting a hillshaded DTM created with las2dem from the ground returns. We buffer the tiles on-the-fly for a seamless hillshade without artifacts along tile boundaries by adding ‘-buffered 25’ and ‘-use_orig_bb’ to the command line. To speed up reading the 25 meter buffers from neighboring tiles we first create a spatial index with lasindex.

lasindex -i tiles_raw\*.laz ^
         -cores 3

las2dem -i tiles_raw\*.laz ^
        -buffered 25 ^
        -keep_class 2 -thin_with_grid 0.5 ^
        -use_orig_bb ^
        -hillshade -epsg 31983 ^
        -odir quality -odix _dtm -opng ^
        -cores 3

hillshaded DTM tiles generated with las2dem and on-the-fly buffering

The resulting hillshaded DTM shows a few minor issues in the ground classification but also a big bump (above the mouse cursor). Closer inspection of this area (you can cut it from the larger tile using the command line below) shows that there is a group of misclassified points about 1200 meters below the terrain. Hence, we will start from scratch to prepare the data for the extraction of forestry metrics.

las2las -i tiles_raw\CODL0004-C0006.laz ^
        -inside_tile 207900 7358350 100 ^
        -o bump.laz

lasview -i bump.laz

bump in hillshaded DTM caused by misclassified low noise

Data Preparation

Using lastile we first tile the data into smaller 500 meter by 500 meter tiles with a 25 meter buffer while flagging all points in the buffer as ‘withheld’. In the same step we lower the resolution to centimeters and put a nicer coordinate offset in the LAS header. We also remove the existing classification and classify all points that are much lower than the target terrain as class 7 (aka noise). Finally we add CRS information and give each tile the base name ‘suzana.laz’.

lastile -i tiles_raw\*.laz ^
        -rescale 0.01 0.01 0.01 ^
        -auto_reoffset ^
        -set_classification 0 ^
        -classify_z_below_as 500.0 7 ^
        -tile_size 500 ^
        -buffer 25 -flag_as_withheld ^
        -epsg 31983 ^
        -odir tiles_buffered -o suzana.laz

With lasnoise we mark the many isolated points that are high above or below the terrain as class 7 (aka noise). Using two tiles we played around with the ‘step’ parameters until we found good settings. See the README of lasnoise for the exact meaning and the choice of parameters for noise classification. We look at one of the resulting tiles with lasview.

lasnoise -i tiles_buffered\*.laz ^
         -step_xy 4 -step_z 2 ^
         -odir tiles_denoised -olaz ^
         -cores 3

lasview -i tiles_denoised\suzana_206000_7357000.laz ^
        -color_by_classification ^
        -win 1024 192

noise points shown in pink: all points (top), only noise points (bottom)

Next we use lasground to classify the last returns into ground (2) and non-ground (1). It is important to ignore the noise points with classification 7 to avoid the kind of bump we saw in the vendor-delivered classification. We again check the quality of the computed ground classification by producing a hillshaded DTM with las2dem. Here the las2dem command line is slightly different: instead of buffering on-the-fly we use the buffers stored with each tile.

lasground -i tiles_denoised\*.laz ^
          -ignore_class 7 ^
          -nature -extra_fine ^
          -odir tiles_ground -olaz ^
          -cores 3

las2dem -i tiles_ground\*.laz ^
        -keep_class 2 -thin_with_grid 0.5 ^
        -hillshade ^
        -use_tile_bb ^
        -odir quality -odix _dtm_new -opng ^
        -cores 3

Finally, with lasheight we compute how high each return is above the triangulated surface of all ground returns and store this height value in place of the elevation value into the z coordinate using the ‘-replace_z’ switch. This height-normalizes the LiDAR in the sense that all ground returns are set to an elevation of 0 while all other returns get an elevation relative to the ground. The result is a set of height-normalized LiDAR tiles that are ready for producing the desired forestry metrics.

lasheight -i tiles_ground\*.laz ^
          -replace_z ^
          -odir tiles_normalized -olaz ^
          -cores 3

Metric Production

The tool for computing the metrics for the entire area as well as for the individual field plots is lascanopy. Which metrics are well suited for your particular imputation calculation is your job to determine. Maybe first compute a large number of them and then eliminate the redundant ones. Do not use any point from the tile buffers for these calculations. We had flagged them as ‘withheld’ during the lastile operation, so they are easy to drop. We also want to drop the noise points that were classified as 7. And we were planning to drop that additional diagonal flightline we noticed during quality checking. We generated two lasinfo reports with the ‘-histo point_source 1’ option for the two tiles it was covering. From the two histograms it was easy to see that the diagonal flightline has the number 31.
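
Those two reports could be generated, for instance, with a call along these lines (the two tile names below are placeholders for the actual upper-left tiles; the ‘_ps’ output suffix is our choice):

rem the two tile names are placeholders for the actual upper-left tiles
lasinfo -i tiles_normalized\suzana_205500_7358500.laz ^
           tiles_normalized\suzana_205500_7359000.laz ^
        -histo point_source 1 ^
        -odir quality -odix _ps -otxt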

First we run lascanopy on the 11 plots that you can download here. When running on plots it makes sense to first create a spatial index with lasindex for faster querying, so that only the tiny parts of the LAZ files that actually cover the plots need to be loaded.

lasindex -i tiles_normalized\*.laz ^
         -cores 3

lascanopy -i tiles_normalized\*.laz -merged ^
          -drop_withheld ^
          -drop_class 7 ^
          -drop_point_source 31 ^
          -lop WKS_PLOTS.shp ^
          -cover_cutoff 3.0 ^
          -cov -dns ^
          -height_cutoff 2.0 ^
          -c 2.0 999.0 ^
          -max -avg -std -kur ^
          -p 25 50 75 95 ^
          -b 30 50 80 ^
          -d 2.0 5.0 10.0 50.0 ^
          -o plots.csv

The resulting ‘plots.csv’ file can easily be processed in other software packages. It contains one line for each polygonal plot listed in the shapefile, giving its bounding box followed by all the requested metrics. But why is there a zero maximum height (marked in orange) for plots 6 through 10? All height metrics are computed solely from returns that are higher than the ‘height_cutoff’ that was set to 2 meters. We added the ‘-c 2.0 999.0’ absolute count metric to keep track of the number of returns used in these calculations. Apparently in plots 6 through 10 there was not a single return above 2 meters, as the count (also marked in orange) is zero for all these plots. It turns out this Eucalyptus stand had recently been harvested and the new seedlings are still shorter than 2 meters.

more plots.csv
index,min_x,min_y,max_x,max_y,max,avg,std,kur,p25,p50,p75,p95,b30,b50,b80,c00,d00,d01,d02,cov,dns
0,206260.500,7358289.909,206283.068,7358312.477,11.23,6.22,1.91,2.26,4.71,6.01,7.67,9.5,26.3,59.7,94.2,5359,18.9,41.3,1.5,76.3,60.0
1,206422.500,7357972.909,206445.068,7357995.477,13.54,7.5,2.54,1.97,5.32,7.34,9.65,11.62,26.9,54.6,92.2,7030,12.3,36.6,13.3,77.0,61.0
2,206579.501,7358125.909,206602.068,7358148.477,12.22,5.72,2.15,2.5,4.11,5.32,7.26,9.76,46.0,73.7,97.4,4901,24.8,28.7,2.0,66.8,51.2
3,206578.500,7358452.910,206601.068,7358475.477,12.21,5.68,2.23,2.64,4.01,5.14,7.18,10.04,48.3,74.1,95.5,4861,25.7,26.2,2.9,68.0,50.2
4,206734.501,7358604.910,206757.068,7358627.478,15.98,10.3,2.18,2.64,8.85,10.46,11.9,13.65,3.3,27.0,91.0,4946,0.6,32.5,44.5,91.0,77.5
5,207043.501,7358761.910,207066.068,7358784.478,15.76,10.78,2.32,3.43,9.27,11.03,12.49,14.11,3.2,20.7,83.3,4819,1.5,24.7,51.0,91.1,76.8
6,207677.192,7359630.526,207699.760,7359653.094,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.0,0.0,0.0,0,0.0,0.0,0.0,0.0,0.0
7,207519.291,7359372.366,207541.859,7359394.934,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.0,0.0,0.0,0,0.0,0.0,0.0,0.0,0.0
8,207786.742,7359255.850,207809.309,7359278.417,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.0,0.0,0.0,0,0.0,0.0,0.0,0.0,0.0
9,208159.017,7358997.344,208181.584,7359019.911,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.0,0.0,0.0,0,0.0,0.0,0.0,0.0,0.0
10,208370.909,7358602.565,208393.477,7358625.133,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.0,0.0,0.0,0,0.0,0.0,0.0,0.0,0.0

Then we run lascanopy on the entire area and produce one raster per tile for each metric. Here we remove the buffered points with the ‘-use_tile_bb’ switch, which also ensures that the produced rasters have exactly the extent of the tiles without buffers. Again, it is imperative that you download version 170327 (or higher) of LAStools for this to work correctly.

lascanopy -version
LAStools (by martin@rapidlasso.com) version 170327 (academic)

lascanopy -i tiles_normalized\*.laz ^
          -use_tile_bb ^
          -drop_class 7 ^
          -drop_point_source 31 ^
          -step 10 ^
          -cover_cutoff 3.0 ^
          -cov -dns ^
          -height_cutoff 2.0 ^
          -c 2.0 999.0 ^
          -max -avg -std -kur ^
          -p 25 50 75 95 ^
          -b 30 50 80 ^
          -d 2.0 5.0 10.0 50.0 ^
          -odir tile_metrics -oasc ^
          -cores 3

The resulting rasters in ASC format can easily be previewed using lasview for some “sanity checking” that our metrics make sense and to get a quick overview of what these metrics look like.

lasview -i tile_metrics\suzana_*max.asc
lasview -i tile_metrics\suzana_*p95.asc
lasview -i tile_metrics\suzana_*p50.asc
lasview -i tile_metrics\suzana_*p25.asc
lasview -i tile_metrics\suzana_*cov.asc
lasview -i tile_metrics\suzana_*d00.asc
lasview -i tile_metrics\suzana_*d01.asc
lasview -i tile_metrics\suzana_*b30.asc
lasview -i tile_metrics\suzana_*b80.asc

The maximum height rasters are useful to inspect more closely as they will immediately tell us if any high noise point slipped through the cracks. And indeed it happened, as we see a maximum of 388.55 meters for one of the 10 by 10 meter cells. As we know the expected height of the trees we could have added a ‘-drop_z_above 70’ to the lascanopy command line. Be careful, however, when computing forestry metrics in strongly sloped terrain, as the terrain slope can significantly lift up returns to heights much higher than that of the tree. This is guaranteed to happen for LiDAR returns from branches that extend horizontally far over the down-sloped part of the terrain as shown in this paper here.
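
For example, the area-wide lascanopy run above could be repeated with that extra filter for the maximum height raster (a sketch; the 70 meter ceiling is an assumption based on the expected tree heights and the output folder name is a placeholder):

rem the 70 m ceiling and the output folder name are assumptions
lascanopy -i tiles_normalized\*.laz ^
          -use_tile_bb ^
          -drop_class 7 ^
          -drop_point_source 31 ^
          -drop_z_above 70 ^
          -height_cutoff 2.0 ^
          -step 10 ^
          -max ^
          -odir tile_metrics_capped -oasc ^
          -cores 3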

We did not use the shapefile for the stands in this exercise. We could have clipped the normalized LiDAR points to these stands using lasclip as shown in the GUI below before generating the raster metrics. This would have saved space and computation time as many of the LiDAR points lie outside of the stands. However, it might be better to do that clipping step on the rasters in whichever GIS software or statistics package you are using for the imputation computation, in order to properly account for partly covered raster cells along the stand boundary. This could be the subject of another blog article … (-:

not all of the LiDAR was needed to compute metrics for the stands
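
On the command line such a clip might look roughly like this (the name of the stands shapefile is a placeholder; see the README of lasclip for the exact polygon options):

rem 'stands.shp' is a placeholder name for the stands shapefile
lasclip -i tiles_normalized\*.laz ^
        -poly stands.shp ^
        -odir tiles_clipped -olaz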

Fixing Intensity Differences between Flightlines (“quick and dirty”)

Visiting our users on-site, such as last week at Mariano Marcos State University in Ilocos Norte in the Philippines, we sometimes come across situations as pictured below where the intensity values of the returns of one flightline are drastically different from those of other flightlines.

The intensity of returns in the leftmost flightline is different from that of the other flightlines.

Using intensity rasters with such dark strips as an additional input for land cover classification may well make things worse. Radiometrically “correct” intensity calibration is a complex topic and may not always be possible using only the LAZ files without meta information such as the internals of the scanning system and the aircraft trajectory. However, we now describe a “quick and dirty” fix for the situation shown above so that the intensity grids (that were computed as averages of first return intensities) at least “look” as sensible as they do for the one square tile (shown below) that was corrected by simply multiplying all intensities of the dark strip by 5.

Simply multiplying all intensities of the dark flightline by 5 seems to “fix” the issue for our test tile.

The number 5 was determined by a quick glance at the intensity histograms that we can generate with lasinfo. We decide to only look at single returns as we expect them to have a higher correlation: their locations are more likely to be “seen similarly” from, and their energy is more likely to be “reflected similarly” to, different flightlines than is the case for multiple returns.

lasinfo -i strip1.laz strip2.laz strip3.laz ^
        -keep_single ^
        -histo intensity 1 ^
        -nmm -nh -nv ^
        -odix _histo_int -otxt

The resulting histogram for the dark ‘strip1.laz’ is quite different from those of the much brighter ‘strip2.laz’ and ‘strip3.laz’. The average single return intensity for the dark ‘strip1.laz’ is a meager 5.13 whereas the brighter ‘strip2.laz’ and ‘strip3.laz’ have similar averages of 24.15 and 24.50 respectively.

Draw your own conclusion about which scale factor to use. We have the choice to match either the peaks of the histograms or their averages. Scaling the peak of 3 for ‘strip1.laz’ to match the 25 of the other two strips would be too much of an upscaling. But the average 24.15 divided by 5.13 gives a potential scale of 4.71, the average 24.50 divided by 5.13 gives a potential scale of 4.77, and we already saw that a multiplication by 5 gives reasonable results. So this is how we can fix the intensity:

las2las -i strip1.laz ^
        -scale_intensity 4.75 ^
        -odix _corr_int -olaz

But what if your data is already in tiles? How can you adjust only the intensity of those returns that are from flightline 1? Assuming that your flightline information is properly stored in the point source ID field of every point, this is easily done with the new ‘-filtered_transform’ in LAStools, using las2las on as many cores as you have as follows:

las2las -i tiles/*.laz ^
        -keep_point_source 1 ^
        -filtered_transform ^
        -scale_intensity 4.75 ^
        -odir tiles_corr -olaz ^
        -cores 8

This is not currently exposed in the GUI of las2las but you can simply add it by typing it into the ‘RUN’ pop-up window as shown below.

Scaling only the intensities of flightline 1 by 4.9 using the new ‘-filtered_transform’.

After this “quick and dirty” intensity correction we again ran lasgrid as follows:

lasgrid -i tiles_corr/*.laz ^
        -gray -set_min_max 0 60 ^
        -odir tiles_int_rasters -opng ^
        -cores 8

And the result is shown below. The obvious flightline-induced discontinuity in the intensities has pretty much disappeared. Do you have similar flightline-related intensity issues? We would like to hear from you whether this technique works or if we need to implement something more clever in the future …

Create proper LAS 1.4 files with LAStools (for free)

So your LiDAR processing workflow produces beautiful LAS or LAZ output. You think the files are nothing short of perfect. But they are in LAS 1.2 format and the tender document explicitly requests delivery in the latest LAS 1.4 format. The free and open source las2las module of LAStools – also known as the Swiss Army knife of LiDAR processing – can easily up-convert them.

There are two possible scenarios: (1) The client wants the data as LAS 1.4 files but does not care about which point type is used. (2) The client wants LAS 1.4 and specifically asks for one of the new point types such as point type 6.

Let us assume you have LAS 1.2 content with point type 1 like these LAZ files from Martin County in Minnesota. Use these four tiles 5126-05-42.laz, 5126-05-43.laz, 5126-06-42.laz, and 5126-06-43.laz to repeat the LAS 1.4 up-conversion we are doing in the following. You will need to use version 160110 of LAStools (or newer) that you can download here.

A report generated with lasinfo shows geo-referencing information stored as GeoTIFF keys in the first VLR. Then there are 10 useless numbers stored in the second VLR that seem left-over from an earlier version of geo-referencing without EPSG codes. The two vendor-specific ‘NIIRS10’ VLRs should probably be removed unless they are meaningful to you. The LAS header is followed by 1564 user-defined bytes (sometimes used as “padding”) that we can also delete before delivery.

D:\LAStools\bin>lasinfo -i martin\5126-05-42.laz
reporting all LAS header entries:
 file signature: 'LASF'
 file source ID: 0
 global_encoding: 0
 project ID GUID data 1-4: E603549E-B74A-4696-3BAD-EA49F643C221
 version major.minor: 1.2
 system identifier: 'LAStools (c) by Martin Isenburg'
 generating software: 'lassort (110815) unlicensed'
 file creation day/year: 119/2011
 header size: 227
 offset to point data: 2163
 number var. length records: 4
 point data format: 1
 point data record length: 28
 number of point records: 13772409
 number of points by return: 13440460 240598 78743 12608 0
 scale factor x y z: 0.01 0.01 0.01
 offset x y z: 0 0 0
 min x y z: 361818.33 4855898.31 349.89
 max x y z: 364400.98 4859420.79 404.22
variable length header record 1 of 4:
 reserved 43707
 user ID 'LASF_Projection'
 record ID 34735
 length after header 40
 description 'by LAStools of Martin Isenburg'
 GeoKeyDirectoryTag version 1.1.0 number of keys 4
 key 1024 tiff_tag_location 0 count 1 value_offset 1 - GTModelTypeGeoKey: ModelTypeProjected
 key 3072 tiff_tag_location 0 count 1 value_offset 26915 - ProjectedCSTypeGeoKey: NAD83 / UTM 15N
 key 3076 tiff_tag_location 0 count 1 value_offset 9001 - ProjLinearUnitsGeoKey: Linear_Meter
 key 4099 tiff_tag_location 0 count 1 value_offset 9001 - VerticalUnitsGeoKey: Linear_Meter
variable length header record 2 of 4:
 reserved 43707
 user ID 'LASF_Projection'
 record ID 34736
 length after header 80
 description 'GeoTiff double parameters'
 GeoDoubleParamsTag (number of doubles 10)
 500000 0 -93 0.9996 0 1 6.37814e+006 298.257 0 0.0174533
variable length header record 3 of 4:
 reserved 43707
 user ID 'NIIRS10'
 record ID 1
 length after header 26
 description 'NIIRS10 Tile Index'
variable length header record 4 of 4:
 reserved 43707
 user ID 'NIIRS10'
 record ID 4
 length after header 10
 description 'NIIRS10 Timestamp'
the header is followed by 1564 user-defined bytes
LASzip compression (version 2.1r0 c2 50000): POINT10 2 GPSTIME11 2
reporting minimum and maximum for all LAS point record entries ...
 X 36181833 36440098
 Y 485589831 485942079
 Z 34989 40422
 intensity 1 4780
 return_number 1 4
 number_of_returns 1 4
 edge_of_flight_line 0 0
 scan_direction_flag 0 1
 classification 1 10
 scan_angle_rank -24 24
 user_data 0 32
 point_source_ID 5401 8015
 gps_time 243852.663596 403514.323142
number of first returns: 13440460
number of intermediate returns: 91408
number of last returns: 13440349
number of single returns: 13199808
overview over number of returns of given pulse: 13199808 323647 198449 50505 0 0 0
histogram of classification of points:
 6560507 unclassified (1)
 6744481 ground (2)
 401464 medium vegetation (4)
 55433 building (6)
 100 noise (7)
 10157 water (9)
 267 rail (10)

(1) Create LAS 1.4 files with point type 1

The las2las command shown below turns a folder of LAS 1.2 files into a folder of LAS 1.4 files with option ‘-set_version 1.4’ without changing the point type. It removes the last three VLRs with option ‘-remove_vlrs_from_to 1 3’ and the user-defined bytes in the LAS header with option ‘-remove_padding’ (*). The result is written as LAZ with option ‘-olaz’ and the task is run on up to 4 CPUs in parallel with option ‘-cores 4’.

(*) option ‘-remove_padding’ is called  ‘-remove_extra’ in older versions of LAStools.

las2las -i martin/*.laz ^
        -remove_vlrs_from_to 1 3 ^
        -remove_padding ^
        -set_version 1.4 ^
        -odir martin_LAS14_pt1 -olaz ^
        -cores 4

(2) Create LAS 1.4 files with point type 6

The las2las command shown below turns a folder of LAS 1.2 files into a folder of LAS 1.4 files with option ‘-set_version 1.4’ and changes the point type from 1 to 6 with option ‘-set_point_type 6’. As before, some VLRs and the header padding are removed. It also adds a second description of the georeferencing information by rewriting the existing information stored as GeoTIFF tags into a properly formatted OGC WKT string with option ‘-set_ogc_wkt’. Thanks to ESRI’s assault on the LAZ format the output still needs to be in LAS. It can be compressed with laszip in a subsequent step using the ‘LAS 1.4 compatibility mode‘ sponsored by NOAA.

las2las -i martin/*.laz ^
        -remove_vlrs_from_to 1 3 ^
        -remove_padding ^
        -set_version 1.4 ^
        -set_point_type 6 ^
        -set_ogc_wkt ^
        -odir martin_LAS14_pt6 -olas ^
        -cores 4

laszip -i martin_LAS14_pt6/*.las -cores 4

Below is the output of lasinfo for the “upgraded” LAS 1.4 file with the OGC WKT string. Some of the key changes have been marked in red.

D:\LAStools\bin>lasinfo -i martin_LAS14_pt6\5126-05-42.las
reporting all LAS header entries:
 file signature: 'LASF'
 file source ID: 0
 global_encoding: 16
 project ID GUID data 1-4: E603549E-B74A-4696-3BAD-EA49F643C221
 version major.minor: 1.4
 system identifier: 'LAStools (c) by rapidlasso GmbH'
 generating software: 'las2las (version 160110)'
 file creation day/year: 119/2011
 header size: 375
 offset to point data: 1135
 number var. length records: 2
 point data format: 6
 point data record length: 30
 number of point records: 0
 number of points by return: 0 0 0 0 0
 scale factor x y z: 0.01 0.01 0.01
 offset x y z: 0 0 0
 min x y z: 361818.33 4855898.31 349.89
 max x y z: 364400.98 4859420.79 404.22
 start of waveform data packet record: 0
 start of first extended variable length record: 0
 number of extended_variable length records: 0
 extended number of point records: 13772409
 extended number of points by return: 13440460 240598 78743 12608 0 0 0 0 0 0 0 0 0 0 0
variable length header record 1 of 2:
 reserved 43707
 user ID 'LASF_Projection'
 record ID 34735
 length after header 40
 description 'by LAStools of Martin Isenburg'
 GeoKeyDirectoryTag version 1.1.0 number of keys 4
 key 1024 tiff_tag_location 0 count 1 value_offset 1 - GTModelTypeGeoKey: ModelTypeProjected
 key 3072 tiff_tag_location 0 count 1 value_offset 26915 - ProjectedCSTypeGeoKey: NAD83 / UTM 15N
 key 3076 tiff_tag_location 0 count 1 value_offset 9001 - ProjLinearUnitsGeoKey: Linear_Meter
 key 4099 tiff_tag_location 0 count 1 value_offset 9001 - VerticalUnitsGeoKey: Linear_Meter
variable length header record 2 of 2:
 reserved 43707
 user ID 'LASF_Projection'
 record ID 2112
 length after header 612
 description 'by LAStools of rapidlasso GmbH'
 WKT OGC COORDINATE SYSTEM:
 PROJCS["NAD83 / UTM 15N",GEOGCS["NAD83",DATUM["North_American_Datum_1983",SPHEROID["GRS 1980",6378137,298.257222101,AUTHORITY["EPSG","7019"]],AUTHORITY["EPSG","6269"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.01745329251994328,AUTHORITY["EPSG","9122"]],AUTHORITY["EPSG","4269"]],PROJECTION["Transverse_Mercator"],PARAMETER["latitude_of_origin",0],PARAMETER["central_meridian",-93],PARAMETER["scale_factor",0.9996],PARAMETER["false_easting",500000],PARAMETER["false_northing",0],UNIT["metre",1,AUTHORITY["EPSG","9001"]],AXIS["Easting",EAST],AXIS["Northing",NORTH],AUTHORITY["EPSG","26915"]]
reporting minimum and maximum for all LAS point record entries ...
 X 36181833 36440098
 Y 485589831 485942079
 Z 34989 40422
 intensity 1 4780
 return_number 1 4
 number_of_returns 1 4
 edge_of_flight_line 0 0
 scan_direction_flag 0 1
 classification 1 10
 scan_angle_rank -24 24
 user_data 0 32
 point_source_ID 5401 8015
 gps_time 243852.663596 403514.323142
 extended_return_number 1 4
 extended_number_of_returns 1 4
 extended_classification 1 10
 extended_scan_angle -4000 4000
 extended_scanner_channel 0 0
number of first returns: 13440460
number of intermediate returns: 91408
number of last returns: 13440349
number of single returns: 13199808
overview over extended number of returns of given pulse: 13199808 323647 198449 50505 0 0 0 0 0 0 0 0 0 0 0
histogram of classification of points:
 6560507 unclassified (1)
 6744481 ground (2)
 401464 medium vegetation (4)
 55433 building (6)
 100 noise (7)
 10157 water (9)
 267 rail (10)