Removing Low Noise from RIEGL’s VUX-1 UAV LiDAR flown in the Philippines

In this tutorial we are removing some “tricky” low noise from LiDAR point clouds in order to produce a high-resolution Digital Terrain Model (DTM). The data was flown above a tropical beach and mangrove area in the Philippines using a VUX-1 UAV based system from RIEGL mounted on a helicopter. The survey was done as a test flight by AB Surveying who have the capacity to fly such missions in the Philippines and who have allowed us to share this data with you for educational purposes. You can download the data (1 GB) here. It covers a popular twin beach knows as “Nacpan” near El Nido in Palawan (that we happen to have visited in 2014).

A typical beach fringed by coconut palms in Palawan, Philippines.

We start our usual quality check with a run of lasinfo. We add the ‘-cd’ switch to compute an average point density and the ‘-histo gps_time 1’ switch to produce a 1 second histogram for the GPS time stamps.

lasinfo -i lalutaya.laz ^
        -cd ^
        -histo gps_time 1 ^
        -odix _info -otxt

You can download the resulting lasinfo report here. It tells us that there are 118,740,310 points of type 3 (with RGB colors) with an average density of 57 last returns per square meter. The point coordinates are in the “PRS92 / Philippines 1” projection with EPSG code 3121 that is based on the “Clarke 1866” ellipsoid.

Datum Transform

We prefer to work in an UTM projection based on the “WGS 1984” ellipsoid, so we will first perform a datum transform based on the seven parameter Helmert transformation – a capacity that was recently added to LAStools. For this we first need a transform to get to geocentric or Earth-Centered Earth-Fixed (ECEF) coordinates on the current “Clarke 1866” ellipsoid, then we apply the Helmert transformation that operates on geocentric coordinates and whose parameters are listed in the TOWGS84 string of EPSG code 3121 to get to geocentric or ECEF coordinates on the “WGS 1984” ellipsoid. Finally we can convert the coordinates to the respective UTM zone. These three calls to las2las accomplish this.

las2las -i lalutaya.laz ^
        -remove_all_vlrs ^
        -epsg 3121 ^
        -target_ecef ^
        -odix _ecef_clark1866 -olaz

las2las -i lalutaya_ecef_clark1866.laz ^
        -transform_helmert -127.62,-67.24,-47.04,-3.068,4.903,1.578,-1.06 ^
        -wgs84 -ecef ^
        -ocut 10 -odix _wgs84 -olaz
las2las -i lalutaya_ecef_wgs84.laz ^
        -target_utm auto ^
        -ocut 11 -odix _utm -olaz

In these steps we implicitly reduced the resolution that each coordinate was stored with from quarter-millimeters (i.e. scale factors of 0.00025) to the default of centimeters (i.e. scale factors of 0.01) which should be sufficient for subsequent vegetation analysis. The LAZ files also compress better when coordinates a lower resolution so that the ‘lalutaya_utm.laz’ file is over 200 MB smaller than the original ‘lalutaya.laz’ file. The quantization error this introduces is probably still below the overall scanning error of this helicopter survey.

Flightline Recovery

Playing back the file visually with lasview suggests that it contains more than one flightline. Unfortunately the point source ID field of the file is not properly populated with flightline information. However, when scrutinizing the GPS time stamp histogram in the lasinfo report we can see an occasional gap. We highlight two of these gaps in red between GPS second 540226 and 540272 and GPS second 540497 and 540556 in this excerpt from the lasinfo report:

gps_time histogram with bin size 1
 bin 540224 has 125768
 bin 540225 has 116372
 bin 540226 has 2707
 bin 540272 has 159429
 bin 540273 has 272049
 bin 540274 has 280237
 bin 540495 has 187103
 bin 540496 has 180421
 bin 540497 has 126835
 bin 540556 has 228503
 bin 540557 has 275025
 bin 540558 has 273861

We can use lasplit to recover the original flightlines based on gaps in the continuity of GPS time stamps that are bigger than 10 seconds:

lassplit -i lalutaya_utm.laz ^
         -recover_flightlines_interval 10 ^
         -odir strips_raw -o lalutaya.laz

This operation splits the points into 11 separate flightlines. The points within each flightline are stored in the order that the vendor software – which was RiPROCESS 1.7.2 from RIEGL according to the lasinfo report – had written them to file. We can use lassort to bring them back into the order they were acquired in by sorting first on the GPS time stamp and then on the return number field:

lassort -i strips_raw\*.laz ^
        -gps_time -return_number ^
        -odir strips_sorted -olaz ^
        -cores 4

Now we turn the sorted flightlines into tiles (with buffers !!!) for further processing. We also erase the current classification of the data into ground (2) and medium vegetation (4) as a quick visual inspection with lasview immediately shows that those are not correct:

lastile -i strips_sorted\*.laz ^
        -files_are_flightlines ^
        -set_classification 0 ^
        -tile_size 250 -buffer 30 -flag_as_withheld ^
        -odir tiles_raw -o lalu.laz

Quality Checking

Next comes the standard check of flightline overlap and alignment check with lasoverlap. Once more it become clear why it is so important to have flightline information. Without we may have missed what we are about to notice. We create false color images load into Google Earth to visually assess the situation. We map all absolute differences between flightlines below 5 cm to white and all absolute differences above 30 cm to saturated red (positive) or blue (negative) with a gradual shading from white to red or blue for any differences in between. We also create an overview KML file that lets us quickly see in which tile we can find the points for a particular area of interest with lasboundary.

lasoverlap -i tiles_raw\*.laz ^
           -step 1 -min_diff 0.05 -max_diff 0.30 ^
           -odir quality -opng ^
           -cores 4

lasboundary -i tiles_raw\*.laz ^
            -use_tile_bb -overview -labels ^
            -o quality\overview.kml

The resulting visualizations show (a) that our datum transform to the WGS84 ellipsoid worked because the imagery aligns nicely with Google Earth and (b) that there are several issues in the data that require further scrutiny.

In general the data seems well aligned (most open areas are white) but there are blue and red lines crossing the survey area. With lasview have a closer look at the visible blue lines running along the beach in tile ‘lalu_765000_1252750.laz’ by repeatedly pressing ‘x’ to select a different subset and ‘x’ again to view this subset up close while pressing ‘c’ to color it differently:

lasview -i tiles_raw\lalu_765000_1252750.laz

These lines of erroneous points do not only happen along the beach but also in the middle of and below the vegetation as can be seen below:

Our initial hope was to use the higher than usual intensity of these erroneous points to reclassify them to some classification code that we would them exclude from further processing. Visually we found that a reasonable cut-off value for this tile would be an intensity above 35000:

lasview -i tiles_raw\lalu_765000_1252750.laz ^
        -keep_intensity_above 35000 ^
        -filtered_transform ^
        -set_classification 23

However, while this method seems successful on the tile shown above it fails miserably on others such as ‘lalu_764250_1251500.laz’ where large parts of the beach are very reflective and result in high intensity returns to to the dry white sand:

lasview -i tiles_raw\lalu_764250_1251500.laz ^
        -keep_intensity_above 35000 ^
        -filtered_transform ^
        -set_classification 23

Low Noise Removal

In the following we describe a workflow that can remove the erroneous points below the ground so that we can at least construct a high-quality DTM from the data. This will not, however, remove the erroneous points above the ground so a subsequent vegetation analysis would still be affected. Our approach is based on two obervations (a) the erroneous points affect only a relatively small area and (b) different flightlines have their erroneous points in different areas. The idea is to compute a set of coarser ground points separately for each flightline and – when combining them in the end – to pick higher ground points over lower ones. The combined points should then define a surface that is above the erroneous below-ground points so that we can mark them with lasheight as not to be used for the actual ground classification done thereafter.

The new huge_las_file_extract_ground_points_only.bat example batch script that you can download here does all the work needed to compute a set of coarser ground points for each flightline. Simply edit the file such that the LAStools variable points to your LAStools\bin folder and rename it to end with the *.bat extension. Then run:

huge_las_file_extract_ground_points_only strips_sorted\lalutaya_0000001.laz strips_ground_only\lalutaya_0000001.laz
huge_las_file_extract_ground_points_only strips_sorted\lalutaya_0000002.laz strips_ground_only\lalutaya_0000001.laz
huge_las_file_extract_ground_points_only strips_sorted\lalutaya_0000003.laz strips_ground_only\lalutaya_0000001.laz
huge_las_file_extract_ground_points_only strips_sorted\lalutaya_0000009.laz strips_ground_only\lalutaya_0000009.laz
huge_las_file_extract_ground_points_only strips_sorted\lalutaya_0000010.laz strips_ground_only\lalutaya_0000010.laz
huge_las_file_extract_ground_points_only strips_sorted\lalutaya_0000011.laz strips_ground_only\lalutaya_0000011.laz

The details on how this batch script works – a pretty standard tile-based multi-core processing workflow – are given as comments in this batch script. Now we have a set of individual ground points computed separately for each flightline and some will include erroneous points below the ground that the lasground algorithm by its very nature is likely to latch on to as you can see here:

The trick is now to utilize the redundancy of multiple scans per area and – when combining flightlines – to pick higher rather than lower ground points in overlap areas by using the ground point closest to the 75th elevation percentile per 2 meter by 2 meter area with at least 3 or more points with lasthin:

lasthin -i strips_ground_only\*.laz -merged ^
        -step 2 -percentile 75 3 ^
        -o lalutaya_ground_only_2m_75_3.laz

There are still some non-ground points in the result as ground-classifying of flightlines individually often results in vegetation returns being included in sparse areas along the edges of the flight lines but we can easily get rid of those:

lasground_new -i lalutaya__ground_only_2m_75_3.laz ^
              -town -hyper_fine ^
              -odix _g -olaz

We sort the remaining ground points into a space-filling curve order with lassort and spatially index them with lasindex so they can be efficiently accessed by lasheight in the next step.

lassort -i lalutaya__ground_only_2m_75_3_g.laz ^
        -keep_class 2 ^
        -o lalutaya_ground.laz

lasindex -i lalutaya_ground.laz

Finally we have the means to robustly remove the erroneous points below the ground from all tiles. We use lasheight with the ground points we’ve just so painstakingly computed to classify all points 20 cm or more below the ground surface they define into classification code 23. Later we simply can ignore this classification code during processing:

lasheight -i tiles_raw\*.laz ^
          -ground_points lalutaya_ground.laz ^
          -do_not_store_in_user_data ^
          -classify_below -0.2 23 ^
          -odir tiles_cleaned -olaz ^
          -cores 4

Rather than trying to ground classify all remaining points we run lasground on a thinned subset of all points. For this we mark the lowest point in every 20 cm by 20 cm grid cell with some temporary classification code such as 6.

lasthin -i tiles_cleaned\*.laz ^
        -ignore_class 23 ^
        -step 0.20 -lowest -classify_as 6 ^
        -odir tiles_thinned -olaz ^
        -cores 4

Finally we can run lasground to compute the ground classification considering all points with classification code 6 by ignoring all points with classification codes 23 and 0.

lasground_new -i tiles_thinned\*.laz ^
              -ignore_class 23 0 ^
              -city -hyper_fine ^
              -odir tiles_ground_new -olaz ^
              -cores 4

And finally we can create a DTM with a resolution of 25 cm using las2dem and the result is truly beautiful:

las2dem -i tiles_ground_new\*.laz ^
        -keep_class 2 ^
        -step 0.25 -use_tile_bb ^
        -odir tiles_dtm_25cm -obil ^
        -cores 4

We have to admit that a few bumps are left (see mouse cursor below) but adjusting the parameters presented here is left as an exercise to the reader.

We would again like to acknowledge AB Surveying whose generosity has made this blog article possible. They have the capacity to fly such missions in the Philippines and who have allowed us to share this data with you for educational purposes.

Removing Excessive Low Noise from Dense-Matching Point Clouds

Point clouds produced with dense-matching by photogrammetry software such as SURE, Pix4D, or Photoscan can include a fair amount of the kind of “low noise” as seen below. Low noise causes trouble when attempting to construct a Digital Terrain Model (DTM) from the points as common algorithm for classifying points into ground and non-ground points – such as lasground – tend to “latch onto” those low points, thereby producing a poor representation of the terrain. This blog post describes one possible LAStools workflow for eliminating excessive low noise. It was developed after a question in the LAStools user forum by LASmoons holder Muriel Lavy who was able to share her noisy data with us. See this, this, this, thisthis, and this blog post for further reading on this topic.

Here you can download the dense matching point cloud that we are using in the following work flow:

We leave the usual inspection of the content with lasinfolasview, and lasvalidate that we always recommend on newly obtained data as an exercise to the reader. Note that a check for proper alignment of flightlines with lasoverlap that we consider mandatory for LiDAR data is not applicable for dense-matching points.

With lastile we turn the original file with 87,261,083 points into many smaller 500 by 500 meter tiles for efficient multi-core processing. Each tile is given a 25 meter buffer to avoid edge artifacts. The buffer points are marked as withheld for easier on-the-fly removal. We add a (terser) description of the WGS84 UTM zone 32N to each tile via the corresponding EPSG code 32632:
lastile -i muriel\20161127_Pancalieri_UTM.laz ^
        -tile_size 500 -buffer 25 -flag_as_withheld ^
        -epsg 32632 ^
        -odir muriel\tiles_raw -o panca.laz
Because dense-matching points often have a poor point order in the files they get delivered in we use lassort to rearrange them into a space-filling curve order as this will speed up most following processing steps:
lassort -i muriel\tiles_raw\panca*.laz ^
        -odir muriel\tiles_sorted -olaz ^
        -cores 7
We then run lasthin to reclassify the highest point of every 2.5 by 2.5 meter grid cell with classification code 8. As the spacing of the dense-matched points is around 40 cm in both x and y, around 40 points will fall into each such grid cell from which the highest is then classified as 8:
lasthin -i muriel\tiles_sorted\panca*.laz ^
        -step 2.5 ^
        -highest -classify_as 8 ^
        -odir muriel\tiles_thinned -olaz ^
        -cores 7
Considering only those points classified as 8 in the last step we then run lasnoise to find points that are highly isolated in wide and flat neighborhoods that are then reclassified as 7. See the README file of lasnoise for a detailed explanation of the different parameters:
lasnoise -i muriel\tiles_thinned\panca*.laz ^
         -ignore_class 0 ^
         -step_xy 5 -step_z 0.1 -isolated 4 ^
         -classify_as 7 ^
         -odir muriel\tiles_isolated -olaz ^
         -cores 7
Now we run a temporary ground classification of only (!!!) on those points that are still classified as 8 using the default parameters of lasground. Hence we only use the points that were the highest points on the 2.5 by 2.5 meter grid and that were not classified as noise in the previous step. See the README file of lasground for a detailed explanation of the different parameters:
lasground -i muriel\tiles_isolated\panca*.laz ^
          -city -ultra_fine -ignore_class 0 7 ^
          -odir muriel\tiles_temp_ground -olaz ^
          -cores 7
The result of this temporary ground filtering is then merely used to mark all points that are 0.5 meter below the triangulated TIN of these temporary ground points with classification code 12 using lasheight. See the README file of lasheight for a detailed explanation of the different parameters:
lasheight -i muriel\tiles_temp_ground\panca*.laz ^
          -do_not_store_in_user_data ^
          -classify_below -0.5 12 ^
          -odir muriel\tiles_temp_denoised -olaz ^
          -cores 7
In the resulting tiles the low noise (but also many points above the ground) are now marked and in a final step we produce properly classified denoised tiles by re-mapping the temporary classification codes to conventions that are more consistent with the ASPRS LAS specification using las2las:
las2las -i muriel\tiles_temp_denoised\panca*.laz ^
        -change_classification_from_to 1 0 ^
        -change_classification_from_to 2 0 ^
        -change_classification_from_to 7 0 ^
        -change_classification_from_to 12 7 ^
        -odir muriel\tiles_denoised -olaz ^
        -cores 7
Let us visually check what each of the above steps has produced by zooming in on a 300 meter by 100 meter strip of points with the bounding box (388500,4963125) to (388800,4963225) in tile ‘panca_388500_4963000.laz’:
The final classification of all points that are not already classified as noise (7) into ground (2) or non-ground (1) was done with a final run of lasground. See the README file of lasground for a detailed explanation of the different parameters:
lasground -i muriel\tiles_denoised\panca*.laz ^
          -ignore_class 7 ^
          -city -ultra_fine ^
          -odir muriel\tiles_ground -olaz ^
          -cores 7
Then we create a seamless hill-shaded DTM tiles by triangulating all the points classified as ground into a temporary TIN (including those in the 25 meter buffer) and then rasterizing only the inner 500 meter by 500 meter of each tile with option ‘-use_tile_bb’ of las2dem. For more details on the importance of buffers in tile-based processing see this blog post here.
las2dem -i muriel\tiles_ground\panca*.laz ^
        -keep_class 2 ^
        -step 1 -hillshade ^
        -use_tile_bb ^
        -odir muriel\tiles_dtm -opng ^
        -cores 7

And here the original DSM side-by-side with resulting DTM after low noise removal. One dense forested area near the center of the data was not entirely removed due to the lack of ground points in this area. Integrating external ground points or manual editing with lasview are two possible way to rectify these few remaining errors …