Complete LiDAR Processing Pipeline: from raw Flightlines to final Products

This tutorial serves as an example of a complete end-to-end workflow that starts with raw LiDAR flightlines (as they may be delivered by a vendor) and ends with final classified LiDAR tiles and derived products such as raster DTMs, DSMs, and SHP files with contours, building footprints, and vegetation layers. The three example flightlines we are using here were flown in Ayutthaya, Thailand with a RIEGL LMS-Q680i LiDAR scanner by Asian Aerospace Services, who are based at Don Mueang airport in Bangkok from where they serve South-East Asia and beyond. You can download them here:

Quality Checking

The minimal quality checks consist of generating textual reports (lasinfo & lasvalidate), inspecting the data visually (lasview), making sure alignment and overlap between flightlines fulfill expectations (lasoverlap), and measuring pulse density per square meter (lasgrid). Additional checks for point replication (lasduplicate), completeness of all returns per pulse (lasreturn), and validation against external ground control points (lascontrol) may also be performed.

lasinfo -i Ayutthaya\strips_raw\*.laz ^
        -cd ^
        -histo z 5 ^
        -histo intensity 64 ^
        -odir Ayutthaya\quality -odix _info -otxt ^
        -cores 3

lasvalidate -i Ayutthaya\strips_raw\*.laz ^
            -o Ayutthaya\quality\validate.xml

The lasinfo report generated with the command line shown computes the average density for each flightline and also generates two histograms: one for the z coordinate with a bin size of 5 meters and one for the intensity with a bin size of 64. The resulting textual descriptions are output into the specified quality folder with the appendix '_info' added to the original file names. Perusing these reports tells you that there are up to 7 returns per pulse, that the average pulse density per flightline is between 7.1 and 7.9 shots per square meter, that the point source IDs of the points are already populated correctly, that there are isolated points far above and far below the scanned area, and that the intensity values range from 0 to 1023 with the majority being below 400. The warnings in the lasinfo and lasvalidate reports about the presence of return numbers 6 and 7 have to do with the history of the LAS format and can safely be ignored.

lasoverlap -i Ayutthaya\strips_raw\*.laz ^
           -files_are_flightlines ^
           -min_diff 0.1 -max_diff 0.3 ^
           -odir Ayutthaya\quality -o overlap.png

This results in two color illustrations. One image shows the flightline overlap: blue indicates coverage by one flightline, turquoise by two, and yellow by three flightlines. Note that wet areas (rivers, lakes, rice paddies, …) without LiDAR returns affect this visualization. The other image shows how well overlapping flightlines align. Their vertical difference is color coded: white means less than 10 cm of error, while saturated red and blue indicate areas with more than 30 cm of positive or negative difference.

One-pixel-wide red and blue edges along buildings and speckles of red and blue in vegetated areas are normal. We need to look out for large systematic errors where terrain features or flightline outlines become visible. If you click yourself through this photo album you will eventually see typical examples (make sure to read the comments too). One area slightly below the center looks suspicious. We load the PNG into the GUI to pick this area for closer inspection with lasview.

lasview -i Ayutthaya\strips_raw\*.laz -gui

Why these flightline differences exist and whether they are detrimental to your purpose are questions that you will have to explore further. For our purpose this isolated difference was noted but will not prevent us from proceeding. Next we want to investigate the pulse density. We do this with lasgrid. We know that the average last return density per flightline is between 7.1 and 7.9 shots per square meter. We set up the false color map for lasgrid such that it is blue when the last return density drops to 5 shots (or less) per square meter and red when it reaches 10 shots (or more).

lasgrid -i Ayutthaya\strips_raw\*.laz -merged ^
        -keep_last ^
        -step 2 -density ^
        -false -set_min_max 5 10 ^
        -odir Ayutthaya\quality -o density_5ppm_10ppm.png

The last return density per square meter mapped to a color: blue is 5 or less, red is 10 or more.

The last return density image clearly shows how the density increases to over 10 pulses per square meter in all areas of flightline overlap. However, as large parts are covered by only one flightline, it is their lower density that should be considered. We note great variations in density along the flightlines. Those have to do with aircraft movement, in particular with pitch. When the nose of the plane goes up even slightly, the gigantic "fan of laser pulses" (which can be thought of as being rigidly attached at the bottom, perpendicular to the flight direction) moves forward faster on the ground far below and therefore covers it with fewer shots per square meter. Conversely, when the nose of the plane goes down, the spacing between scan lines far below the plane is condensed so that the density increases. Inclement weather and the resulting pitch turbulence can have a big impact on how regular the laser pulse spacing is on the ground. Read this article for more on LiDAR pulse density and spacing.

LiDAR Preparation

When you have airborne LiDAR in flightlines the first step is to tile the data into square tiles that are typically 1000 by 1000 or – for higher density surveys – 500 by 500 meters in size. This can be done with lastile. We also add a buffer of 30 meters to each tile. Why buffers are important for tile-based processing is explained here. We choose 30 meters as this is larger than any building we expect in this area and slightly larger than the ‘-step’ size we use later when classifying the points into ground and non-ground points with lasground.

lastile -i Ayutthaya\strips_raw\*.laz ^
        -tile_size 500 -buffer 30 -flag_as_withheld ^
        -odir Ayutthaya\tiles_raw -o ayu.laz

NOTE: Usually you will have to add '-files_are_flightlines' or '-apply_file_source_ID' to the lastile command shown above in order to preserve the information about which points are from which flightline. We do not have to do this here, as is evident from the lasinfo reports we generated earlier: not only is the file source ID in the LAS header correctly set to 2, 3, or 4, reflecting the intended flightline numbering evident from the file names, but the point source ID of each point is also already set to the correct value 2, 3, or 4. For more info see this or this discussion from the LAStools user forum.
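In case your own data does need this, a sketch of such a lastile call (here the '-files_are_flightlines' variant, reusing the directory names from above) could look as follows:

:: sketch only: not needed for the Ayutthaya flightlines shown above
lastile -i Ayutthaya\strips_raw\*.laz ^
        -files_are_flightlines ^
        -tile_size 500 -buffer 30 -flag_as_withheld ^
        -odir Ayutthaya\tiles_raw -o ayu.laz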

Next we classify isolated points that are far from most other points with lasnoise into the (default) classification code 7. See the README file for the meaning of the parameters and play around with different settings to get a feel for how to make this process more or less aggressive.

lasnoise -i Ayutthaya\tiles_raw\ayu*.laz ^
         -step_xy 4 -step_z 2 -isolated 5 ^
         -odir Ayutthaya\tiles_denoised -olaz ^
         -cores 4

Especially for ground classification it is important that low noise points are excluded. You should inspect a number of tiles with lasview to make sure the low noise points are all pink now when you color the points by classification.

lasview -i Ayutthaya\tiles_denoised\ayu*.laz -gui
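In addition to the visual check, a quick textual sanity check is possible by generating classification histograms for the denoised tiles with lasinfo (a sketch; the '_class' appendix is simply our choice of name):

:: sketch: count per tile how many points were classified as noise (7)
lasinfo -i Ayutthaya\tiles_denoised\ayu*.laz ^
        -histo classification 1 ^
        -odir Ayutthaya\quality -odix _class -otxt ^
        -cores 4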

While the algorithms in lasground are designed to withstand a few noise points below the ground, you will find that lasground will include them in the ground model if there are too many of them. Hence, it is important to tell lasground to ignore these noise points. For the other parameters used see the README file of lasground.

lasground -i Ayutthaya\tiles_denoised\ayu*.laz ^
          -ignore_class 7 ^
          -city -ultra_fine ^
          -compute_height ^
          -odir Ayutthaya\tiles_ground -olaz ^
          -cores 4

You should visually check the resulting ground classification of each tile with lasview by selecting smaller subsets (press 'x', draw a rectangle, press 'x' again, use the arrow keys to walk) and then switching back and forth between a triangulation of the ground points (press 'g' and then press 't') and a triangulation of the last returns (press 'l' and then press 't'). See the README of lasview for more information on these hotkeys.

lasview -i Ayutthaya\tiles_ground\ayu*.laz -gui

This way I found at least one tile that should be reclassified with '-metro' instead of '-city' because its ground classification still contained one large building. Alternatively you can correct such misclassifications manually using lasview as shown in the next few screenshots.
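As a sketch, such a tile can simply be re-run individually with '-metro' and the output written over the previous result (assuming, for example, that the affected tile is ayu_669500_1586500.laz):

:: sketch: re-classify a single suspicious tile with the more aggressive '-metro' setting
lasground -i Ayutthaya\tiles_denoised\ayu_669500_1586500.laz ^
          -ignore_class 7 ^
          -metro -ultra_fine ^
          -compute_height ^
          -odir Ayutthaya\tiles_ground -olaz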

Manual editing with LASlayers is an optional step for advanced users who have a license. In case you managed to do such a manual edit and saved it as a LAY file using LASlayers (see the README file of lasview), you can overwrite the old file with:

laslayers -i Ayutthaya\tiles_ground\ayu_669500_1586500.laz -ilay ^
          -o Ayutthaya\tiles_ground\ayu_669500_1586500_edited.laz

move Ayutthaya\tiles_ground\ayu_669500_1586500_edited.laz ^
     Ayutthaya\tiles_ground\ayu_669500_1586500.laz

The next step classifies those points that are neither ground (2) nor noise (7) into building (or rather roof) points (class 6) and high vegetation points (class 5). For this we use lasclassify with the default parameters, which only consider points that are at least 2 meters above the classified ground points (see the README for details on all available parameters).

lasclassify -i Ayutthaya\tiles_ground\ayu*.laz ^
            -ignore_class 7 ^
            -odir Ayutthaya\tiles_classified -olaz ^
            -cores 4

We check the classification of each tile with lasview by selecting smaller subsets (press 'x', draw a rectangle, press 'x' again) and by traversing through the point cloud with the arrow keys. You will find a number of misclassifications: boats are classified as buildings, water towers or complex temple roofs as vegetation, … and so on. You could use lasview to manually correct any classifications that are really important.

lasview -i Ayutthaya\tiles_classified\ayu*.laz -gui

Before delivering the classified LiDAR tiles to a customer or another user it is imperative to remove the buffers that were carried through all computations to avoid artifacts along the tile boundary. This can also be done with lastile.

lastile -i Ayutthaya\tiles_classified\ayu*.laz ^
        -remove_buffer ^
        -odir Ayutthaya\tiles_final -olaz ^
        -cores 4

Together with the tiling you may want to deliver a tile overview file in KML format (or in SHP format) that you can easily generate with lasboundary using this command line:

lasboundary -i Ayutthaya\tiles_final\ayu*.laz ^
            -use_bb ^
            -overview -labels ^
            -o Ayutthaya\tiles_overview.kml

The small KML file generated by lasboundary provides a quick overview of where the tiles are located, their names, their bounding boxes, and the number of points they contain.

Derivative Production

The most common derivative product produced from LiDAR data is a Digital Terrain Model (DTM) in the form of an elevation raster. This can be obtained by interpolating the ground points with a triangulation (i.e. a Delaunay TIN) and sampling the TIN at the center of each raster cell. The pulse density of well over 4 shots per square meter definitely supports a resolution of 0.5 meters for the raster DTM. From the ground-classified tiles with buffer we compute the DTM using las2dem as follows:

las2dem -i Ayutthaya\tiles_ground\ayu*.laz ^
        -keep_class 2 ^
        -step 0.5 -use_tile_bb ^
        -odir Ayutthaya\tiles_dtm -obil ^
        -cores 4

It's important to add '-use_tile_bb' to the command line, which limits the raster generation to the original tile size of 500 by 500 meters in order not to rasterize the buffers that extend the tiles by 30 meters in each direction. We used the BIL format so that we can inspect the resulting elevation rasters with lasview:

lasview -i Ayutthaya\tiles_dtm\ayu*.bil -gui

To create a hillshaded version of the DTM you can use your favorite raster processing package such as GDAL or GRASS but you could also use the BLAST extension of LAStools and create a large seamless image with blast2dem as follows:

blast2dem -i Ayutthaya\tiles_dtm\ayu*.bil -merged ^
          -step 0.5 -hillshade -epsg 32647 ^
          -o Ayutthaya\dtm_hillshade.png

Because blast2dem does not parse the PRJ files that accompany the BIL rasters we have to specify the EPSG code explicitly to also get a KML file that allows us to visualize the LiDAR in Google Earth.

A hillshading of the merged DTM rasters produced with blast2dem.

Next we generate a Digital Surface Model (DSM) that includes the highest objects the laser has hit. We use the spike-free algorithm implemented in las2dem, which creates a triangulation of the highest returns, as follows:

las2dem -i Ayutthaya\tiles_denoised\ayu*.laz ^
        -drop_class 7 ^
        -step 0.5 -spike_free 1.2 -use_tile_bb ^
        -odir Ayutthaya\tiles_dsm -obil ^
        -cores 4

We used 1.2 as the freeze value for the spike-free algorithm because this is about three times the average last return spacing reported in the individual lasinfo reports generated during quality checking (at 7 to 8 last returns per square meter the average spacing is roughly 1/sqrt(7.5) ≈ 0.37 meters, and three times that is about 1.1 meters). Again we inspect the resulting rasters with lasview:

lasview -i Ayutthaya\tiles_dsm\ayu*.bil -gui
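If you would like a seamless hillshaded overview of the DSM as well, the same blast2dem call that we used for the DTM can be applied to the DSM tiles (a sketch; the output file name is our own choice):

:: sketch: merged hillshade of the spike-free DSM tiles
blast2dem -i Ayutthaya\tiles_dsm\ayu*.bil -merged ^
          -step 0.5 -hillshade -epsg 32647 ^
          -o Ayutthaya\dsm_hillshade.png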

For comparison we also generate the DSM rasters using a simple first-return interpolation, again with las2dem, as follows:

las2dem -i Ayutthaya\tiles_denoised\ayu*.laz ^
        -drop_class 7 -keep_first ^
        -step 0.5 -use_tile_bb ^
        -odir Ayutthaya\tiles_dsm -obil ^
        -cores 4

A few direct side-by-side comparisons between the spike-free DSM and the first-return DSM show differences that are especially noticeable along building edges and in large trees.

Another product that we can easily create is a layer of building footprints derived from the automatically classified roofs using lasboundary. Because the tool is quite scalable we can simply merge the final tiles on the fly. This also avoids including duplicate points from the tile buffers, whose classifications are also often less accurate.

lasboundary -i Ayutthaya\tiles_final\ayu*.laz -merged ^
            -keep_class 6 ^
            -disjoint -concavity 1.1 ^
            -o Ayutthaya\buildings.shp

Similarly we can use lasboundary to create a vegetation layer from those points that were automatically classified as high vegetation.

lasboundary -i Ayutthaya\tiles_final\ayu*.laz -merged ^
             -keep_class 5 ^
             -disjoint -concavity 3 ^
             -o Ayutthaya\vegetation.shp

We can also produce 1.0 meter contour lines from the ground classified points. However, for nicer contours it is beneficial to first generate a subset of the ground points with lasthin using option ‘-contours 1.0’ as follows:

lasthin -i Ayutthaya\tiles_final\ayu*.laz ^
        -keep_class 2 ^
        -step 1.0 -contours 1.0 ^
        -odir Ayutthaya\tiles_temp -olaz ^
        -cores 4

We then merge all subsets of ground points from those temporary tiles on-the-fly into one (using the ‘-merged’ option) and let blast2iso from the BLAST extension of LAStools generate smoothed and simplified 1 meter contours as follows:

blast2iso -i Ayutthaya\tiles_temp\ayu*.laz -merged ^
          -iso_every 1.0 ^
          -smooth 2 -simplify_length 0.5 -simplify_area 0.5 -clean 5.0 ^
          -o Ayutthaya\contours_1m.shp

Finally we composite all of our derived LiDAR products into one map using QGIS and then fuse it with data from OpenStreetMap that we’ve downloaded from BBBike. Here you can download the OSM data that we use.

It is particularly interesting to compare the building footprints that were automatically derived with our LiDAR processing pipeline with those mapped by OpenStreetMap volunteers. We immediately see that there is a lot of volunteer mapping work left to do, and the LiDAR-derived data can assist us in directing those mapping efforts. A closer look reveals the (expected) quality difference between hand-mapped and auto-generated data.

The OSM buildings are simpler. These polygons are drawn and divided into logical units by a human, they are individually verified, and they correspond to actual buildings. However, they are less well aligned with the Google Earth satellite image. The LiDAR-derived building footprints are complex because lasboundary has no logic to simplify the complicated polygonal chains that enclose the points automatically classified as roof into rectilinear shapes, or to break directly adjacent roofs into separate logical units. However, most buildings are found (along with some objects that are not buildings) and their geospatial alignment is as good as that of the LiDAR data.

Three Videos from Full Day Workshop on LiDAR at IIST Trivandrum in India

Here are three videos from a full day workshop on LiDAR processing held in October 2017 at the Department of Earth and Space Sciences of the Indian Institute of Space Science and Technology (IIST) in Thiruvananthapuram, Kerala, India. It was organized by Dr. A. M. Ramiya, whom we thank very much for the invitation and for being such a kind host. After our usual introduction to LiDAR and LAStools we use three raw flightlines from Ayutthaya, Thailand as example data to perform a complete LiDAR workflow including

  1. LiDAR quality checking such as pulse density, coverage, and flightline alignment
  2. LiDAR preparation (compressing, tiling, denoising, classifying)
  3. LiDAR derivative creation (DTM and DSM rasters, contours, building and vegetation footprints)

You can download the three flightlines used in the tutorial here (line2, line3, line4) to follow along.

morning video

after lunch video

after tea video

Scotland’s LiDAR goes Open Data (too)

Following the lead of England and Wales, the Scottish LiDAR is now also open data. The implementation of such an open geospatial policy in the United Kingdom was spearheaded by the Environment Agency of England, who started to make all of their LiDAR holdings available as open data. In September 2015 they opened DTM and DSM raster derivatives down to 25 cm resolution, and in March 2016 the raw point clouds also went online in our compressed and open LAZ format (more info here) – all under the very permissive Open Government Licence v3. This treasure trove of geospatial data was collected by the Environment Agency Geomatics' own survey aircraft, mainly for flood mapping purposes. The data had been access-restricted for the past 17 years of operation and was made open only after it was shown that restricting access in order to recover costs to finance future operations – a common argument for withholding taxpayer-funded data – was nothing but an utter myth. This open data policy has resulted in an incredible re-use of the LiDAR, and the Environment Agency has literally been propelled into the role of a "champion for open data", inspiring Wales (and possibly the German states of North Rhine-Westphalia and Thuringia) and now also Scotland to open up their geospatial archives as well …

Huge LAS files available for download from the Scottish Open Data portal.

We went to the nice online portal of Scotland to download three files from the Phase II LiDAR for Scotland that are provided as uncompressed LAS files, namely LAS_NN45NE.las, LAS_NN55NE.las, and LAS_NN55NW.las, whose sizes are listed as 1.2 GB, 2.8 GB, and 4.7 GB in the screenshot above. Needless to say, it took quite some time and several restarts (using wget with option '-c') to successfully download these very large LAS files.
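If a download breaks off, it can be resumed where it left off with wget (a sketch; substitute the actual download URL shown on the portal):

:: sketch: '-c' continues a partially downloaded file
wget -c "https://<portal-download-url>/LAS_NN55NW.las"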

laszip -i LAS_NN45NE.las -odix _cm -olaz -rescale 0.01 0.01 0.01 
laszip -i LAS_NN45NE.las -odix _mm -olaz
laszip -i LAS_NN55NE.las -odix _cm -olaz -rescale 0.01 0.01 0.01 
laszip -i LAS_NN55NE.las -odix _mm -olaz
laszip -i LAS_NN55NW.las -odix _cm -olaz -rescale 0.01 0.01 0.01 
laszip -i LAS_NN55NW.las -odix _mm -olaz

After downloading we decided to see how well these files compress with LASzip by running the six commands shown above, creating LAZ files with the coordinate resolution re-scaled to centimeters (cm) as well as LAZ files with the original millimeter (mm) coordinate resolution (i.e. the original scale factors are 0.001, which is somewhat excessive for airborne LiDAR where the positional error per coordinate is typically between 5 cm and 20 cm). Below you see the resulting file sizes for the three different files.

 1,164,141,247 LAS_NN45NE.las
   124,351,690 LAS_NN45NE_cm.laz (1 : 9.4)
   146,651,719 LAS_NN45NE_mm.laz (1 : 7.9)
 2,833,123,863 LAS_NN55NE.las
   396,521,115 LAS_NN55NE_cm.laz (1 : 7.1)
   474,767,495 LAS_NN55NE_mm.laz (1 : 6.0)
 4,664,782,671 LAS_NN55NW.las
   531,454,473 LAS_NN55NW_cm.laz (1 : 8.8)
   629,141,151 LAS_NN55NW_mm.laz (1 : 7.4)

The savings of storing the LiDAR in LAZ versus LAS are sixfold to tenfold in download time and storage space. If I were a taxpayer in Scotland and my government was hosting open data in the Amazon cloud (i.e. paying for AWS cloud services with my taxes), I would encourage them to store their data in a more compressed format. Here are some more details on the data.
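If you want to convince yourself that the millimeter-resolution LAZ files really are lossless copies of the original point data, you can compare them against the LAS originals with lasdiff (a sketch):

:: sketch: reports any differences between the two files
lasdiff -i LAS_NN45NE.las -i LAS_NN45NE_mm.laz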

According to the provided metadata, the Scottish Public Sector LiDAR Phase II dataset was commissioned by the Scottish Government in response to the Flood Risk Management Act (2009). The project was managed by Sniffer and the contract was awarded to Fugro BKS. Airborne LiDAR data was collected for 66 sites (the dataset does not have full national coverage) totaling 3,516 km^2 between 29th November 2012 and 18th April 2014. The point density was a minimum of 1 point/sqm and approximately 2 points/sqm on average. A DTM and DSM were produced from the point clouds with 1 m spatial resolution. The coordinate reference system is OSGB 1936 / British National Grid (EPSG code 27700). The data is licensed under an Open Government Licence. However, the use constraints section states not only that the following attribution statement must be used to acknowledge the source of the information – "Copyright Scottish Government and SEPA (2014)" – but also that Fugro retains the commercial copyright, which is somewhat disconcerting and may require more clarification. According to this tweet a lesser license (NCGL) applies to the raw LiDAR point clouds. Below is a lasinfo report for the large LAS_NN55NW.las as well as several visualizations with lasview.

lasinfo (170915) report for LAS_NN55NW.las
reporting all LAS header entries:
 file signature: 'LASF'
 file source ID: 0
 global_encoding: 1
 project ID GUID data 1-4: 00000000-0000-0000-0000-000000000000
 version major.minor: 1.2
 system identifier: 'Riegl LMS-Q'
 generating software: 'Fugro LAS Processor'
 file creation day/year: 343/2016
 header size: 227
 offset to point data: 227
 number var. length records: 0
 point data format: 1
 point data record length: 28
 number of point records: 166599373
 number of points by return: 149685204 14102522 2531075 280572 0
 scale factor x y z: 0.001 0.001 0.001
 offset x y z: 250050 755050 270
 min x y z: 250000.000 755000.000 203.731
 max x y z: 254999.999 759999.999 491.901
reporting minimum and maximum for all LAS point record entries ...
 X -50000 4949999
 Y -50000 4949999
 Z -66269 221901
 intensity 39 2046
 return_number 1 4
 number_of_returns 1 4
 edge_of_flight_line 0 1
 scan_direction_flag 1 1
 classification 1 11
 scan_angle_rank -30 30
 user_data 0 3
 point_source_ID 66 91
 gps_time 38230669.389034 38402435.753789
number of first returns: 149685204
number of intermediate returns: 2813604
number of last returns: 149687616
number of single returns: 135599244
overview over number of returns of given pulse: 135599244 23122229 6754118 1123782 0 0 0
histogram of classification of points:
 287819 unclassified (1)
 109019874 ground (2)
 14476880 low vegetation (3)
 3487218 medium vegetation (4)
 39141518 high vegetation (5)
 165340 building (6)
 13508 rail (10)
 7216 road surface (11)

Kudos to the Scottish government for opening their data. We hereby acknowledge the source of the LiDAR that we have used in the experiments above as “Copyright Scottish Government and SEPA (2014)”.

LAStools Win Big at INTERGEO Taking Home Two Innovation Awards

PRESS RELEASE (for immediate release)
October 2, 2017
rapidlasso GmbH, Gilching, Germany

At INTERGEO 2017 in Berlin, rapidlasso GmbH – the makers of the popular LiDAR processing software LAStools – were awarded top honors in both of the categories they had been nominated for: most innovative software and most innovative startup. The third award for most innovative hardware went to Leica Geosystems for the BLK360 terrestrial scanner. The annual Wichmann Innovation Awards have been part of INTERGEO for six years now. Already at the inaugural event in 2012 the open source LiDAR compressor LASzip of rapidlasso GmbH had been nominated, coming in as runner-up in second place.

Dr. Martin Isenburg, the founder and CEO of rapidlasso GmbH, receives the two innovation awards at the ceremony during INTERGEO 2017 in Berlin.

After receiving the two awards, Dr. Martin Isenburg, the founder and CEO of rapidlasso GmbH, was quick to thank the "fun, active, and dedicated user community" of the LAStools software for their "incredible support in the online voting". He pointed out that it is the users who make LAStools more than just efficient software for processing point clouds. Since 2011, the community surrounding LAStools has constantly grown to several thousand users who help and motivate each other in designing workflows and in solving format issues and processing challenges. They are an integral part of what makes these tools so valuable, said Dr. Isenburg.

About rapidlasso GmbH:
Technology powerhouse rapidlasso GmbH specializes in efficient LiDAR processing tools that are widely known for their high productivity. They combine robust algorithms with efficient I/O and clever memory management to achieve high throughput for data sets containing billions of points. The company’s flagship product – the LAStools software suite – has deep market penetration and is heavily used in industry, government agencies, research labs, and educational institutions. Visit http://rapidlasso.com for more information.

LASmoons: Huaibo Mu

Huaibo Mu (recipient of three LASmoons)
Environmental Mapping, Department of Geography
University College London (UCL), UK

Background:
This study is part of the EU-funded Metrology for Earth Observation and Climate project (MetEOC-2). It aims to combine terrestrial and airborne LiDAR data to estimate biomass and allometry for woodland trees in the UK. Airborne LiDAR can capture large amounts of data over large areas, while terrestrial LiDAR can provide much more detail of high quality in specific areas. The biomass and allometry for individual trees of specific species in 1 ha of Wytham Woods, located about 5 km north-west of the University of Oxford, UK, are estimated by combining both airborne and terrestrial LiDAR. Then the bias will be evaluated when the estimation is applied at different levels: terrestrial LiDAR level, tree level, and area level. The goal is better insights and a controllable error range in the bias of biomass and allometry estimates for woodland trees based on airborne LiDAR.

Goal:
The study aims to find suitable parameters for the allometric equations of different species and to establish the relationship between tree height, stem diameter, and crown diameter in order to estimate the above-ground biomass using airborne LiDAR. The biomass estimates at the different levels are then compared to evaluate the bias, and the total 6 ha of Wytham Woods are used for calibration and validation. Finally the results are to be applied to other woodlands in the UK. The LiDAR processing tasks for which LAStools are used mainly center around the creation of a suitable Canopy Height Model (CHM) from the airborne LiDAR.

Data:
+ Airborne LiDAR data produced by Professor David Coomes (University of Cambridge) with Airborne Research and Survey Facility (ARSF) Project code of RG13_08 in June 2014. The average point density is about 5.886 per m^2.
+ Terrestrial LiDAR data collected by the UCL team led by Dr. Mat Disney and Dr. Kim Calders in order to develop very detailed 3D models of the trees.
+ Fieldwork from the project “Initial Results from Establishment of a Long-term Broadleaf Monitoring Plot at Wytham Woods, Oxford, UK” by Butt et al. (2009).

LAStools processing:
1) check LiDAR quality as described in these videos and articles [lasinfo, lasvalidate, lasoverlap, lasgrid, las2dem]
2) classify into ground and non-ground points using tile-based processing  [lastile, lasground]
3) generate a Digital Terrain Model (DTM) [las2dem]
4) compute height of points and delete points higher than maximum tree height obtained from terrestrial LiDAR [lasheight]
5) convert points into disks with 10 cm diameter to conservatively account for laser beam width [lasthin]
6) generate spike-free Digital Surface Model (DSM) based on algorithm by Khosravipour et al. (2016) [las2dem]
7) create Canopy Height Model (CHM) by subtracting DTM from spike-free DSM [lasheight] (one possible chaining of steps 4 to 7 is sketched below).
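Below is a minimal sketch of how steps 4) to 7) might be chained with LAStools on a single hypothetical tile; all file names and thresholds are placeholders, and because '-replace_z' height-normalizes the points first, the spike-free surface produced by the last call already is the CHM:

:: sketch only: file names, the 45 m height cutoff, and the freeze value are placeholders
lasheight -i tile_with_ground.laz ^
          -replace_z -drop_above 45 ^
          -o tile_normalized.laz

lasthin -i tile_normalized.laz ^
        -subcircle 0.05 ^
        -step 0.5 -highest ^
        -o tile_splatted.laz

las2dem -i tile_splatted.laz ^
        -step 0.5 -spike_free 1.2 ^
        -o tile_chm.bil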

References:
Butt, N., Campbell, G., Malhi, Y., Morecroft, M., Fenn, K., & Thomas, M. (2009). Initial results from establishment of a long-term broadleaf monitoring plot at Wytham Woods, Oxford, UK. University of Oxford, Oxford, UK, Report.
Khosravipour, A., Skidmore, A.K., Isenburg, M., Wang, T.J., & Hussin, Y.A. (2014). Generating pit-free Canopy Height Models from airborne LiDAR. Photogrammetric Engineering and Remote Sensing, 80, 863-872.
Khosravipour, A., Skidmore, A.K., Isenburg, M., & Wang, T.J. (2015). Development of an algorithm to generate pit-free Digital Surface Models from LiDAR. Proceedings of SilviLaser 2015, pp. 247-249, September 2015.
Khosravipour, A., Skidmore, A.K., & Isenburg, M. (2016). Generating spike-free Digital Surface Models using raw LiDAR point clouds: a new approach for forestry applications. (Journal manuscript under review.)

Removing Excessive Low Noise from Dense-Matching Point Clouds

Point clouds produced by dense-matching photogrammetry software such as SURE, Pix4D, or Photoscan can include a fair amount of the kind of "low noise" seen below. Low noise causes trouble when attempting to construct a Digital Terrain Model (DTM) from the points, as common algorithms for classifying points into ground and non-ground points – such as lasground – tend to "latch onto" those low points, thereby producing a poor representation of the terrain. This blog post describes one possible LAStools workflow for eliminating excessive low noise. It was developed after a question in the LAStools user forum by LASmoons holder Muriel Lavy, who was able to share her noisy data with us. See this, this, this, this, this, and this blog post for further reading on this topic.

Here you can download the dense matching point cloud that we are using in the following work flow:

We leave the usual inspection of the content with lasinfo, lasview, and lasvalidate that we always recommend on newly obtained data as an exercise to the reader. Note that a check for proper alignment of flightlines with lasoverlap, which we consider mandatory for LiDAR data, is not applicable for dense-matching points.

With lastile we turn the original file with 87,261,083 points into many smaller 500 by 500 meter tiles for efficient multi-core processing. Each tile is given a 25 meter buffer to avoid edge artifacts. The buffer points are marked as withheld for easier on-the-fly removal. We add a (terser) description of the WGS84 UTM zone 32N to each tile via the corresponding EPSG code 32632:
lastile -i muriel\20161127_Pancalieri_UTM.laz ^
        -tile_size 500 -buffer 25 -flag_as_withheld ^
        -epsg 32632 ^
        -odir muriel\tiles_raw -o panca.laz
Because dense-matching points often have a poor point order in the files they get delivered in, we use lassort to rearrange them into a space-filling curve order, as this will speed up most of the following processing steps:
lassort -i muriel\tiles_raw\panca*.laz ^
        -odir muriel\tiles_sorted -olaz ^
        -cores 7
We then run lasthin to reclassify the highest point of every 2.5 by 2.5 meter grid cell with classification code 8. As the spacing of the dense-matched points is around 40 cm in both x and y, around 40 points will fall into each such grid cell from which the highest is then classified as 8:
lasthin -i muriel\tiles_sorted\panca*.laz ^
        -step 2.5 ^
        -highest -classify_as 8 ^
        -odir muriel\tiles_thinned -olaz ^
        -cores 7
Considering only those points classified as 8 in the last step we then run lasnoise to find points that are highly isolated in wide and flat neighborhoods that are then reclassified as 7. See the README file of lasnoise for a detailed explanation of the different parameters:
lasnoise -i muriel\tiles_thinned\panca*.laz ^
         -ignore_class 0 ^
         -step_xy 5 -step_z 0.1 -isolated 4 ^
         -classify_as 7 ^
         -odir muriel\tiles_isolated -olaz ^
         -cores 7
Now we run a temporary ground classification only (!!!) on those points that are still classified as 8, using the default parameters of lasground. Hence we only use the points that were the highest points on the 2.5 by 2.5 meter grid and that were not classified as noise in the previous step. See the README file of lasground for a detailed explanation of the different parameters:
lasground -i muriel\tiles_isolated\panca*.laz ^
          -city -ultra_fine -ignore_class 0 7 ^
          -odir muriel\tiles_temp_ground -olaz ^
          -cores 7
The result of this temporary ground filtering is then merely used to mark all points that are more than 0.5 meters below the triangulated TIN of these temporary ground points with classification code 12 using lasheight. See the README file of lasheight for a detailed explanation of the different parameters:
lasheight -i muriel\tiles_temp_ground\panca*.laz ^
          -do_not_store_in_user_data ^
          -classify_below -0.5 12 ^
          -odir muriel\tiles_temp_denoised -olaz ^
          -cores 7
In the resulting tiles the low noise (but also many points above the ground) are now marked and in a final step we produce properly classified denoised tiles by re-mapping the temporary classification codes to conventions that are more consistent with the ASPRS LAS specification using las2las:
las2las -i muriel\tiles_temp_denoised\panca*.laz ^
        -change_classification_from_to 1 0 ^
        -change_classification_from_to 2 0 ^
        -change_classification_from_to 7 0 ^
        -change_classification_from_to 12 7 ^
        -odir muriel\tiles_denoised -olaz ^
        -cores 7
Let us visually check what each of the above steps has produced by zooming in on a 300 meter by 100 meter strip of points with the bounding box (388500,4963125) to (388800,4963225) in tile ‘panca_388500_4963000.laz’:
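One way to do this is to cut that strip out of the corresponding tile at each stage with las2las and load it into lasview (a sketch; the output name is our own choice):

:: sketch: extract the 300 by 100 meter strip for closer inspection
las2las -i muriel\tiles_denoised\panca_388500_4963000.laz ^
        -keep_xy 388500 4963125 388800 4963225 ^
        -o muriel\strip_check.laz

lasview -i muriel\strip_check.laz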
The final classification of all points that are not already classified as noise (7) into ground (2) or non-ground (1) was done with a final run of lasground. See the README file of lasground for a detailed explanation of the different parameters:
lasground -i muriel\tiles_denoised\panca*.laz ^
          -ignore_class 7 ^
          -city -ultra_fine ^
          -odir muriel\tiles_ground -olaz ^
          -cores 7
Then we create seamless hill-shaded DTM tiles by triangulating all points classified as ground into a temporary TIN (including those in the 25 meter buffer) and then rasterizing only the inner 500 by 500 meters of each tile with option '-use_tile_bb' of las2dem. For more details on the importance of buffers in tile-based processing see this blog post here.
las2dem -i muriel\tiles_ground\panca*.laz ^
        -keep_class 2 ^
        -step 1 -hillshade ^
        -use_tile_bb ^
        -odir muriel\tiles_dtm -opng ^
        -cores 7

And here is the original DSM side-by-side with the resulting DTM after low noise removal. One densely forested area near the center of the data was not entirely removed from the DTM due to the lack of ground points in this area. Integrating external ground points or manual editing with lasview are two possible ways to rectify these few remaining errors …

Integrating External Ground Points in Forests to Improve DTM from Dense-Matching Photogrammetry

The biggest problem when generating a Digital Terrain Model (DTM) from photogrammetric point clouds produced from aerial imagery with dense-matching software such as SURE, Pix4D, or Photoscan is dense vegetation: where plants completely cover the terrain, not a single point is generated on the ground. This is different for LiDAR point clouds, as the laser can penetrate even dense multi-level tropical forests. The complete lack of ground points in larger vegetated areas such as closed forests or dense plantations means that the many processing workflows for vegetation analysis that have been developed for LiDAR cannot be used for photogrammetric point clouds … unless … well, unless we are getting those missing ground points some other way. In the following we see how to integrate external ground points to generate a reasonable DTM under a dense forest with LAStools. See this, this, this, this, and this article for further reading.

Here you can download the dense matching point cloud, the manually collected ground points, and the forest stand delineating polygon that we are using in the following example work flow:

We leave the usual inspection of the content with lasinfo and lasview that we always recommend on newly obtained data as an exercise to the reader. Using las2dem and lasgrid we created the Google Earth overlays shown above to visualize the extent of the dense matched point cloud and the distribution of the manually collected ground points:

las2dem -i DenseMatching.laz ^
        -thin_with_grid 1.0 ^
        -extra_pass ^
        -step 2.0 ^
        -hillshade ^
        -odix _hill_2m -opng

lasgrid -i ManualGround.laz ^
        -set_RGB 255 0 0 ^
        -step 10 -rgb ^
        -odix _grid_10m -opng

Attempts to ground-classify the dense-matching point cloud directly are futile as there are no ground points under the canopy in the heavily forested area. Therefore 558 ground points were manually surveyed in the forest of interest; they are around 50 to 120 meters apart from one another. We show how to integrate these points into the dense-matching point cloud such that we can successfully extract bare-earth information from the data.

In the first step we “densify” the manually collected ground points by interpolating them with triangles onto a raster of 2 meter resolution that we store as LAZ points with las2dem. You could consider other interpolation schemes to “densify” the ground points, here we use simple linear interpolation to prove the concept. Due to the varying distance between the manually surveyed ground points we allow interpolating triangles with edge lengths of up to 125 meters. These triangles then also cover narrow open areas next to the forest, so we clip the interpolated ground points against the forest stand delineating polygon with lasclip to classify those points that are really in the forest as “key points” (class 8) and all others as “noise” (class 7).

las2dem -i ManualGround.laz ^
        -step 2 ^
        -kill 125 ^
        -odix _2m -olaz

lasclip -i ManualGround_2m.laz ^
        -set_classification 7 ^ 
        -poly forest.shp ^
        -classify_as 8 -interior ^
        -odix _forest -olaz

Below we show the resulting densified ground points colored by elevation that survive the clipping against the forest stand delineating polygon and were classified as “key points” (class 8). The interpolated ground points in narrow open areas next to the forest that fall outside this polygon were classified as “noise” (class 7) and are shown in violet. They will be dropped in the next step.

We then merge the dense-matching points with the densified manual ground points (while dropping all the violet points marked as noise) as input to lasthin and reclassify the lowest point per 1 meter by 1 meter cell with a temporary code (here we use class 9, which usually refers to "water"). Only the subset of lowest points that receives the temporary classification code 9 will be used for ground classification later.

lasthin -i DenseMatching.laz ^
        -i ManualGround_2m_forest.laz ^
        -drop_class 7 ^
        -merged ^
        -lowest -step 1 -classify_as 9 ^
        -o DenseMatchingAndDensifiedGround.laz

We use the GUI of lasview to pick several interesting areas for visual inspection. The selected points load much faster when the LAZ file is spatially indexed and therefore we first run lasindex. For better orientation we also load the forest stand delineating polygon as an overlay into the GUI.

lasindex -i DenseMatchingAndDensifiedGround.laz 

lasview -i DenseMatchingAndDensifiedGround.laz -gui

We pick the area shown below that contains the target forest with manually collected and densified ground points and a forested area with only dense matching points. The difference could not be more drastic as the visualizations show.

Now we run the ground classification using lasground with option '-town', using only the points with the temporary code 9 by ignoring all points with classifications 0 and 8 in the file. We leave the temporary classification code 9 unchanged for all points that were not classified with "ground" code 2 so that we can visualize later which ones those are.

lasground -i DenseMatchingAndDensifiedGround.laz ^
          -ignore_class 0 8 ^
          -town ^
          -non_ground_unchanged ^
          -o GroundClassified.laz

We again use the GUI of lasview to pick several interesting areas after running lasindex and again load the forest stand delineating polygon as an overlay into the GUI.

lasindex -i GroundClassified.laz 

lasview -i GroundClassified.laz -gui

We pick the area shown below that contains all three scenarios: the target forest with manually collected and densified ground points, an open area with only dense matching points, and a forested area with only dense matching points. The result is as expected: in the target forest the manually collected ground points are used as ground and in the open area the dense-matching points are used as ground. But there is no useful ground in the other forested area.

Now we can compute the heights of the points above ground for our target forest with lasheight and either replace the z elevations in the file or store them separately as "extra bytes". Then we can compute, for example, a Canopy Height Model (CHM) that color codes the height of the vegetation above the ground with lasgrid. Of course this will only be correct in the target forest where we have "good" ground, but not in the other forested areas. We also compute a hillshaded DTM to be able to visually inspect the topography of the generated terrain model.

lasheight -i GroundClassified.laz ^
          -store_as_extra_bytes ^
          -o GroundClassifiedWithHeights.laz

lasgrid -i GroundClassifiedWithHeights.laz ^
        -step 2 ^
        -highest -attribute 0 ^
        -false -set_min_max 0 25 ^
        -o chm.png

las2dem -i GroundClassified.laz ^
        -keep_class 2 -extra_pass ^
        -step 2 ^ 
        -hillshade ^
        -o dtm.png

Here you can download the resulting color-coded CHM and the resulting hill-shaded DTM as Google Earth KMZ overlays. Clearly the resulting CHM is only meaningful in the target forest where we used the manually collected ground points to create a reasonable DTM. In the other forested areas the ground is only correct near the forest edges and gets worse with increasing distance from open areas. The resulting DTM exhibits some interesting looking bumps in the middle of areas with manually collected ground points. Those are a result of using the dense-matching points as ground whenever their elevation is lower than that of the manually collected points (which is decided in the lasthin step). Whether those bumps represent true elevations or are artifacts of erroneously low elevations from dense-matching remains to be investigated.

For forests on complex and steep terrain the number of ground points that needs to be manually collected may make such an approach infeasible in practice. However, maybe you have another source of elevation, such as a low-resolution DTM of 10 or 25 meters provided by your local government. Or maybe even a high-resolution DTM of 1 or 2 meters from a LiDAR survey you did several years ago. While the forest may have grown a lot in the past years, the ground under the forest will probably not have changed much …
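As a rough sketch (with hypothetical file names, and assuming your LAStools version reads common raster formats such as ASC or BIL directly as points), such a coarse DTM could be turned into "ground" points and merged in exactly the same way as the manually surveyed points above:

:: sketch only: hypothetical names; the coarse raster stands in for the surveyed ground points
las2las -i government_dtm_10m.asc ^
        -set_classification 8 ^
        -o government_ground.laz

lasthin -i DenseMatching.laz ^
        -i government_ground.laz ^
        -merged ^
        -lowest -step 1 -classify_as 9 ^
        -o DenseMatchingAndCoarseGround.laz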