Estonia leads in Open LiDAR: nationwide & multi-temporal Point Clouds now Online

At the beginning of July 2018 the Baltic country of Estonia – with an area of 45 thousand square kilometers inhabited by around 1.3 million people – opened much of their geospatial data archives and is now offering easy and free download of LiDAR point clouds nationwide via a portal of the Estonian Land Board. What is even more exciting is that multi-temporal data sets flown in different years and seasons are available. Raw LiDAR point clouds collected either during a “regular flight in spring” or during a “forestry flight in summer” can be obtained for multiple years. The 1 km by 1 km tile with map sheet index 377650, for example, is available for four different LiDAR surveys carried out in spring 2011, summer 2013, spring 2015 and summer 2017. This offers incredible potential for studying temporal changes of man-made or natural environments. The screenshot sequence below shows how to navigate to the download site starting from this page.

We found out about this open data release during our hands-on workshop on LiDAR and photogrammetry point cloud processing with LAStools that was part of the UAV remote sensing summer school in Tartu, Estonia. See our calendar for upcoming events or contact us for holding a similar event at your university, agency, company, or conference. 

The LiDAR data provided on the download portal is compressed with LASzip and provided as 1km by 1km LiDAR tiles in LAZ format. You can search for these tiles via their Estonian 1:2000 map sheet indices. To find out which map sheet index corresponds to the tile you are interested in you can overlay the maps sheet indices over an online map. However, you will need to zoom in before you can see the indices as illustrated in the screenshot sequence below. Here a zoom to the map sheet indices for the area that we visited during the social event of the summer school.

One thing we noticed is that the tiles contain only a single layer of points. The overlaps between flightlines were removed which results in a more uniform point density but strips the user of the possibility to do their own flightline alignment checks with lasoverlap. Below you see the spring 2014 acquisition for the tile with map sheet index 475861 colored by classification, elevation, return type, flightline ID and intensity.

The license for the open data of the Estonian Land Board is very permissive and can be found here. Agreeing to the licence gives the licence holder the rights to use data free of charge for an unspecified term, to good purpose in accordance with law and best practice. Licence holder may produce derivatives of data, combine data with its own products or services, use data for commercial or non-commercial purposes and redistribute data. The licence holder obliges to refer to the origin of data when publishing and redistributing data. The reference must include the name of the licensor, the title of data, the age of data (or the date of data extraction).

Complete LiDAR Processing Pipeline: from raw Flightlines to final Products

This tutorial serves as an example for a complete end-to-end workflow that starts with raw LiDAR flightlines (as they may be delivered by a vendor) to final classified LiDAR tiles and derived products such as raster DTM, DSM, and SHP files with contours, building footprint and vegetation layers. The three example flightlines we are using here were flown in Ayutthaya, Thailand with a RIEGL LMS Q680i LiDAR scanner by Asian Aerospace Services who are based at the Don Mueang airport in Bangkok from where they are serving South-East-Asia and beyond. You can download them here:

Quality Checking

The minimal quality checks consist of generating textual reports (lasinfo & lasvalidate), inspecting the data visually (lasview), making sure alignment and overlap between flightlines fulfill expectations (lasoverlap), and measuring pulse density per square meter (lasgrid). Additional checks for points replication (lasduplicate), completeness of all returns per pulse (lasreturn), and validation against external ground control points (lascontrol) may also be performed.

lasinfo -i Ayutthaya\strips_raw\*.laz ^
        -cd ^
        -histo z 5 ^
        -histo intensity 64 ^
        -odir Ayutthaya\quality -odix _info -otxt ^
        -cores 3

lasvalidate -i Ayutthaya\strips_raw\*.laz ^
            -o Ayutthaya\quality\validate.xml

The lasinfo report generated with the command line shown computes the average density for each flightline and also generates two histograms, one for the z coordinate with a bin size of 5 meter and one for the intensity with a bin size of 64. The resulting textual descriptions are output into the specified quality folder with an appendix ‘_info’ added to the original file name. Perusing these reports tells you that there are up to 7 returns per pulse, that the average pulse density per flightline is between 7.1 to 7.9 shots per square meter, that the point source IDs of the points are already populated correctly, that there are isolated points far above and far below the scanned area, and that the intensity values range from 0 to 1023 with the majority being below 400. The warnings in the lasinfo and the lasvalidate reports about the presence of return numbers 6 and 7 have to do with the history of the LAS format and can safely be ignored.

lasoverlap -i Ayutthaya\strips_raw\*.laz ^
           -files_are_flightlines ^
           -min_diff 0.1 -max_diff 0.3 ^
           -odir Ayutthaya\quality -o overlap.png

This results in two color illustrations. One image shows the flightline overlap with blue indicating one flightline, turquoise indicating two, and yellow indicating three flightlines. Note that wet areas (rivers, lakes, rice paddies, …) without LiDAR returns affect this visualization. The other image shows how well overlapping flightlines align. Their vertical difference is color coded with while meaning less than 10 cm error while saturated red and blue indicate areas with more than 30 cm positive or negative difference.

One pixel wide red and blue along building edges and speckles of red and blue in vegetated areas are normal. We need to look-out for large systematic errors where terrain features or flightline outlines become visible. If you click yourself through this photo album you will eventually see typical examples (make sure to read the comments too). One area slightly below the center looks suspicious. We load the PNG into the GUI to pick this area for closer inspection with lasview.

lasview -i Ayutthaya\strips_raw\*.laz -gui

Why these flightline differences exist and whether they are detrimental to your purpose are questions that you will have to explore further. For out purpose this isolated difference was noted but will not prevent us from proceeding further. Next we want to investigate the pulse density. We do this with lasgrid. We know that the average last return density per flightline is between 7.1 to 7.9 shots per square meter. We set up the false color map for lasgrid such that it is blue when the last return density drops to 5 shots (or less) per square meter and such that it is red when the last return density reaches 10 shots (or more).

lasgrid -i Ayutthaya\strips_raw\*.laz -merged ^
        -keep_last ^
        -step 2 -density ^
        -false -set_min_max 4 8 ^
        -odir Ayutthaya\quality -o density_4ppm_8ppm.png

The last return density per square meter mapped to a color: blue is 5 or less, red is 10 or more.

The last return density image clearly shows how the density increases to over 10 pulses per square meter in all areas of flightline overlap. However, as there are large parts covered by only one flightline their density is the one that should be considered. We note great variations in density along the flightlines. Those have to do with aircraft movement and in particular with the pitch. When the nose of the plane goes up even slightly, the gigantic “fan of laser pulses” (that can be thought of as being rigidly attached at the bottom perpendicular to the aircraft flight direction) is moving faster forward on the ground far below and therefore covers it with fewer shots per square meter. Conversely when the nose of the plane goes down the spacing between scan lines far below the plane are condensed so that the density increases. Inclement weather and the resulting airplane pitch turbulence can have a big impact on how regular the laser pulse spacing is on the ground. Read this article for more on LiDAR pulse density and spacing.

LiDAR Preparation

When you have airborne LiDAR in flightlines the first step is to tile the data into square tiles that are typically 1000 by 1000 or – for higher density surveys – 500 by 500 meters in size. This can be done with lastile. We also add a buffer of 30 meters to each tile. Why buffers are important for tile-based processing is explained here. We choose 30 meters as this is larger than any building we expect in this area and slightly larger than the ‘-step’ size we use later when classifying the points into ground and non-ground points with lasground.

lastile -i Ayutthaya\strips_raw\*.laz ^
        -tile_size 500 -buffer 30 -flag_as_withheld ^
        -odir Ayutthaya\tiles_raw -o ayu.laz

NOTE: Usually you will have to add ‘-files_are_flightlines’ or ‘-apply_file_source_ID’ to the lastile command shown above in order to preserve the information which points is from which flightline. We do not have to do this here as evident from the lasinfo reports we generated earlier. Not only is the file source ID in the LAS header is correctly set to 2, 3, or 4 reflecting the intended flightline numbering evident from the file names. Also the point source ID of each point is already set to the correct value 2, 3, or 4. For more info see this or this discussion from the LAStools user forum.

Next we classify isolated points that are far from most other points with lasnoise into the (default) classification code 7. See the README file for the meaning of the parameters and play around with different setting to get a feel for how to make this process more or less aggressive.

lasnoise -i Ayutthaya\tiles_raw\ayu*.laz ^
         -step_xy 4 -step_z 2 -isolated 5 ^
         -odir Ayutthaya\tiles_denoised -olaz ^
         -cores 4

Especially for ground classification it is important that low noise points are excluded. You should inspect a number of tiles with lasview to make sure the low noise are all pink now if you color them by classification.

lasview -i Ayutthaya\tiles_denoised\ayu*.laz -gui

While the algorithms in lasground are designed to withstand a few noise points below the ground, you will find that it will include them into the ground model if there are too many of them. Hence, it is important to tell lasground to ignore these noise points. For the other parameters used see the README file of lasground.

lasground -i Ayutthaya\tiles_denoised\ayu*.laz ^
          -ignore_class 7 ^
          -city -ultra_fine ^
          -compute_height ^
          -odir Ayutthaya\tiles_ground -olaz ^
          -cores 4

You should visually check the resulting ground classification of each tile with lasview by selecting smaller subsets (press ‘x’, draw a rectangle, press ‘x’ again, use arrow keys to walk) and then switch back and forth between a triangulation of the ground points (press ‘g’ and then press ‘t’) and a triangulation of last returns (press ‘l’ and then press ‘t’). See the README of lasview for more information on those hotkeys.

lasview -i Ayutthaya\tiles_ground\ayu*.laz -gui

This way I found at least one tile that should be reclassified with ‘-metro’ instead of ‘-city’ because it still contained one large building in the ground classification. Alternatively you can correct miss-classifications manually using lasview as shown in the next few screen shots.

This is an optional step for advanced users who have a license. In case you managed to do such a manual edit and saved it as a LAY file using LASlayers (see the README file of lasview) you can overwrite the old file with:

laslayers -i Ayutthaya\tiles_ground\ayu_669500_1586500.laz -ilay ^
          -o Ayutthaya\tiles_ground\ayu_669500_1586500_edited.laz

move Ayutthaya\tiles_ground\ayu_669500_1586500_edit.laz ^
     Ayutthaya\tiles_ground\ayu_669500_1586500.laz

The next step classifies those points that are neither ground (2) nor noise (7) into building (or rather roof) points (class 6) and high vegetation points (class 5). For this we use lasclassify with the default parameters that only considers points that are at least 2 meters above the classified ground points (see the README for details on all available parameters).

lasclassify -i Ayutthaya\tiles_ground\ayu*.laz ^
            -ignore_class 7 ^
            -odir Ayutthaya\tiles_classified -olaz ^
            -cores 4

We  check the classification of each tile with lasview by selecting smaller subsets (press ‘x’, draw a rectangle, press ‘x’ again) and by traversing with the arrow keys though the point cloud. You will find a number of miss-classifications. Boats are classified as buildings, water towers or complex temple roofs as vegetation, … and so on. You could use lasview to manually correct any classifications that are really important.

lasview -i Ayutthaya\tiles_classified\ayu*.laz -gui

Before delivering the classified LiDAR tiles to a customer or another user it is imperative to remove the buffers that were carried through all computations to avoid artifacts along the tile boundary. This can also be done with lastile.

lastile -i Ayutthaya\tiles_classified\ayu*.laz ^
        -remove_buffer ^
        -odir Ayutthaya\tiles_final -olaz ^
        -cores 4

Together with the tiling you may want to deliver a tile overview file in KML format (or in SHP format) that you can easily generate with lasboundary using this command line:

lasboundary -i Ayutthaya\tiles_final\ayu*.laz ^
            -use_bb ^
            -overview -labels ^
            -o Ayutthaya\tiles_overview.kml

The small KML file generated b lasboundary provides a quick overview where tiles are located, their names, their bounding box, and the number of points they contain.

Derivative production

The most common derivative product produced from LiDAR data is a Digital Terrain Model (DTM) in form of an elevation raster. This can be obtained by interpolating the ground points with a triangulation (i.e. a Delaunay TIN) and by sampling the TIN at the center of each raster cell. The pulse density of well over 4 shots per square meter definitely supports a resolution of 0.5 meter for the raster DTM. From the ground-classified tiles with buffer we compute the DTM using las2dem as follows:

las2dem -i Ayutthaya\tiles_ground\ayu*.laz ^
        -keep_class 2 ^
        -step 0.5 -use_tile_bb ^
        -odir Ayutthaya\tiles_dtm -obil ^
        -cores 4

It’s important to add ‘-use_tile_bb’ to the command line which limits the raster generation to the original tile sizes of 500 by 500 meters in order not to rasterize the buffers that are extending the tiles 30 meters in each direction. We used the BIL format so that we inspect the resulting elevation rasters with lasview:

lasview -i Ayutthaya\tiles_dtm\ayu*.bil -gui

To create a hillshaded version of the DTM you can use your favorite raster processing package such as GDAL or GRASS but you could also use the BLAST extension of LAStools and create a large seamless image with blast2dem as follows:

blast2dem -i Ayutthaya\tiles_dtm\ayu*.bil -merged ^
          -step 0.5 -hillshade -epsg 32647 ^
          -o Ayutthaya\dtm_hillshade.png

Because blast2dem does not parse the PRJ files that accompany the BIL rasters we have to specify the EPSG code explicitly to also get a KML file that allows us to visualize the LiDAR in Google Earth.

A a hillshading of the merged DTM rasters produced with blast2dem.

Next we generate a Digital Surface Model (DSM) that includes the highest objects that the laser has hit. We use the spike-free algorithm that is implemented in las2dem that creates a triangulation of the highest returns as follows:

las2dem -i Ayutthaya\tiles_denoised\ayu*.laz ^
        -drop_class 7 ^
        -step 0.5 -spike_free 1.2 -use_tile_bb ^
        -odir Ayutthaya\tiles_dsm -obil ^
        -cores 4

We used 1.0 as the freeze value for the spike free algorithm because this is about three times the average last return spacing reported in the individual lasinfo reports generated during quality checking. Again we inspect the resulting rasters with lasview:

lasview -i Ayutthaya\tiles_dsm\ayu*.bil -gui

For reason of comparison we also generate the DSM rasters using a simple first-return interpolation again with las2dem as follows:

las2dem -i Ayutthaya\tiles_denoised\ayu*.laz ^
        -drop_class 7 -keep_first ^
        -step 0.5 -use_tile_bb ^
        -odir Ayutthaya\tiles_dsm -obil ^
        -cores 4

A few direct side-by-side comparison between a spike-free DSM and a first-return DSM shows the difference that are especially noticeable along building edges and in large trees.

Another product that we can easily create are building footprints from the automatically classified roofs using lasboundary. Because the tool is quite scalable we can simply on-the-fly merge the final tiles. This also avoids including duplicate points from the tile buffer whose classifications are also often less accurate.

lasboundary -i Ayutthaya\tiles_final\ayu*.laz -merged ^
            -keep_class 6 ^
            -disjoint -concavity 1.1 ^
            -o Ayutthaya\buildings.shp

Similarly we can use lasboundary to create a vegetation layer from those points that were automatically classified as high vegetation.

lasboundary -i Ayutthaya\tiles_final\ayu*.laz -merged ^
             -keep_class 5 ^
             -disjoint -concavity 3 ^
             -o Ayutthaya\vegetation.shp

We can also produce 1.0 meter contour lines from the ground classified points. However, for nicer contours it is beneficial to first generate a subset of the ground points with lasthin using option ‘-contours 1.0’ as follows:

lasthin -i Ayutthaya\tiles_final\ayu*.laz ^
        -keep_class 2 ^
        -step 1.0 -contours 1.0 ^
        -odir Ayutthaya\tiles_temp -olaz ^
        -cores 4

We then merge all subsets of ground points from those temporary tiles on-the-fly into one (using the ‘-merged’ option) and let blast2iso from the BLAST extension of LAStools generate smoothed and simplified 1 meter contours as follows:

blast2iso -i Ayutthaya\tiles_temp\ayu*.laz -merged ^
          -iso_every 1.0 ^
          -smooth 2 -simplify_length 0.5 -simplify_area 0.5 -clean 5.0 ^
          -o Ayutthaya\contours_1m.shp

Finally we composite all of our derived LiDAR products into one map using QGIS and then fuse it with data from OpenStreetMap that we’ve downloaded from BBBike. Here you can download the OSM data that we use.

It’s in particular interesting to compare the building footprints that were automatically derived from our LiDAR processing pipeline with those mapped by OpenStreetMap volunteers. We immediately see that there is a lot of volunteering work left to do and the LiDAR-derived data can assist us in directing those mapping efforts. A closer look reveals the (expected) quality difference between hand-mapped and auto-generated data.

The OSM buildings are simpler. These polygons are drawn and divided into logical units by a human. They are individually verified and correspond to actual buildings. However, they are less aligned with the Google Earth satellite image. The LiDAR-derived buildings footprints are complex because lasboundary has no logic to simplify the complicated polygonal chains that enclose the points that were automatically classified as roof into rectilinear shapes or to break directly adjacent roof points into separate logical units. However, most buildings are found (but also objects that are not buildings) and their geospatial alignment is as good as that of the LiDAR data.

New Step-by-Step Tutorial for Velodyne Drone LiDAR from Snoopy by LidarUSA

The folks from Harris Aerial gave us LiDAR data from a test-flight of one of their drones, the Carrier H4 Hybrid HE (with a 5kg maximum payload and a retail price of US$ 28,000), carrying a Snoopy A series LiDAR system from LidarUSA in the countryside near Huntsville, Alabama. The laser scanner used by the Snoopy A series is a Velodyne HDL 32E that has 32 different laser/detector pairs that fire in succession to scan up to 700,000 points per second within a range of 1 to 70 meters. You can download the raw LiDAR file from the 80 second test flight here. As always, the first thing we do is to visualize the file with lasview and to generate a textual report of its contents with lasinfo.

lasview -i Velodyne001.laz -set_min_max 680 750

It becomes obvious that the drone must have scanned parts of itself (probably the landing gear) during the flight and we exploit this fact in the later processing. The information which of the 32 lasers was collecting which point is stored into the ‘point source ID’ field which is usually used for the flightline information. This results in a psychedelic look in lasview as those 32 different numbers get mapped to the 8 different colors that lasview uses for distinguishing flightlines.

The lasinfo report we generate computes the average point density with ‘-cd’ and includes histograms for a number of point attributes, namely for ‘user data’, ‘intensity’, ‘point source ID’, ‘GPS time’, and ‘scan angle rank’.

lasinfo -i Velodyne001.laz ^
        -cd ^
        -histo user_data 1 ^
        -histo point_source 1 ^
        -histo intensity 16 ^
        -histo gps_time 1 ^
        -histo scan_angle_rank 5 ^
        -odir quality -odix _info -otxt

You can download the resulting report here and it will tell you that the information which of the 32 lasers was collecting which point was stored both into the ‘user data’ field and into the ‘point source ID’ field. The warnings you see below have to do with the fact that the double-precision bounding box stored in the LAS header was populated with numbers that have many more decimal digits than the coordinates in the file, which only have millimeter (or millifeet) resolution as all three scale factors are 0.001 (meaning three decimal digits).

WARNING: stored resolution of min_x not compatible with x_offset and x_scale_factor: 2171988.6475160527
WARNING: stored resolution of min_y not compatible with y_offset and y_scale_factor: 1622812.606925504
WARNING: stored resolution of min_z not compatible with z_offset and z_scale_factor: 666.63504345017589
WARNING: stored resolution of max_x not compatible with x_offset and x_scale_factor: 2172938.973065129
WARNING: stored resolution of max_y not compatible with y_offset and y_scale_factor: 1623607.5209975131
WARNING: stored resolution of max_z not compatible with z_offset and z_scale_factor: 1053.092674726669

Both the “return number” and the “number of returns” attribute of every points in the file is 2. This makes it appear as if the file would only contain the last returns of those laser shots that produced two returns. However, as the Velodyne HDL 32E only produces one return per shot we can safely conclude that those numbers should all be 1 instead of 2 and that this is just a small bug in the export software. We can easily fix this with las2las.

reporting minimum and maximum for all LAS point record entries ...
[...]
 return_number 2 2
 number_of_returns 2 2
[...]

The lasinfo report lacks information about the coordinate reference system as there is no VLR that stores projection information. Harris Aerial could not help us other than telling us that the scan was near Huntsville, Alamaba. Measuring certain distances in the scene like the height of the house or the tree suggests that both horizontal and vertical units are in feet, or rather in US survey feet. After some experimenting we find that using EPSG 26930 for NAD83 Alabama West but forcing the default horizontal units from meters to US survey feet gives a result that aligns well with high-resolution Google Earth imagery as you can see below:

lasgrid -i flightline1.laz ^
        -i flightline2.laz ^
        -merged ^
        -epsg 26930 -survey_feet ^
        -step 1 -highest ^
        -false -set_min_max 680 750 ^
        -o testing26930usft.png

Using EPSG code 26930 but with US survey feet instead of meters results in nice alignment with GE imagery.

We use the fact that the drone has scanned itself to extract an (approximate) trajectory by isolating those LiDAR returns that have hit the drone. Via a visual check with lasview (by hovering with the cursor over the lowest drone hits and pressing hotkey ‘i’) we determine that the lowest drone hits are all above 719 feet. We use two calls to las2las to split the point cloud vertically. In the same call we also change the resolution from three to two decimal digits, fix the return number issue, and add the missing geo-referencing information:

las2las -i Velodyne001.laz ^
        -rescale 0.01 0.01 0.01 ^
        -epsg 26930 -survey_feet -elevation_survey_feet ^
        -set_return_number 1 ^
        -set_number_of_returns 1 ^
        -keep_z_above 719 ^
        -odix _above719 -olaz

las2las -i Velodyne001.laz ^
        -rescale 0.01 0.01 0.01 ^
        -epsg 26930 -survey_feet -elevation_survey_feet ^
        -set_return_number 1 ^
        -set_number_of_returns 1 ^
        -keep_z_below 719 ^
        -odix _below719 -olaz

We then use the manual editing capabilities of lasview to change the classifications of the LiDAR points that correspond to drone hits from 1 to 12, which is illustrated by the series of screen shots below.

lasview -i Velodyne001_above719.laz

The workflow illustrated above results in a tiny LAY file that is part of the LASlayers functionality of LAStools. It only encodes the few changes in classifications that we’ve made to the LAZ file without re-writing those parts that have not changed. Those interested may use laslayers to inspect the structure of the LAY file:

laslayers -i Velodyne001_above719.laz

We can apply the LAY file on-the-fly with the ‘-ilay’ option, for example, when running lasview:

lasview -i Velodyne001_above719.laz -ilay

To separate the drone-hit trajectory from the remaining points we run lassplit with the ‘-ilay’ option and request to split by classification with this command line:

lassplit -i Velodyne001_above719.laz -ilay ^
         -by_classification -digits 3 ^
         -olaz

This gives us two new files with the three-digit appendices ‘_001’ and ‘_012’. The latter one contains those points we marked as being part of the trajectory. We now want to use lasview to – visually – find a good moment in time where to split the trajectory into multiple flightlines. The lasinfo report tells us that the GPS time stamps are in the range from 418,519 to 418,602. In order to use the same trick as we did in our recent article on processing LiDAR data from the Hovermap Drone, where we mapped the GPS time to the intensity for querying it via lasview, we first need to subtract a large number from the GPS time stamps to bring them into a suitable range that fits the intensity field as done here.

lasview -i Velodyne001_above719_012.laz ^
        -translate_gps_time -418000 ^
        -bin_gps_time_into_intensity 1
        -steps 5000

The ‘-steps 5000’ argument makes for a slower playback (press ‘p’ to repeat) to better follow the trajectory.

Hovering with the mouse over a point that – visually – seems to be one of the turning points of the drone and pressing ‘i’ on the keyboard shows an intensity value of 548 which corresponds to the GPS time stamp 418548, which we then use to split the LiDAR point cloud (without the trajectory) into two flightlines:

las2las -i Velodyne001_below719.laz ^
        -i Velodyne001_above719_001.laz ^
        -merged ^
        -keep_gps_time_below 418548 ^
        -o flightline1.laz

las2las -i Velodyne001_below719.laz ^
        -i Velodyne001_above719_001.laz ^
        -merged ^
        -keep_gps_time_above 418548 ^
        -o flightline2.laz

Next we use lasoverlap to check how well the LiDAR points from the flight out and the flight back align vertically. This tool computes the difference of the lowest points for each square foot covered by both flightlines. Differences of less than a quarter of a foot are mapped to white, differences of more than half a foot are mapped to saturated red or blue depending on whether the difference is positive or negative:

lasoverlap -i flightline1.laz ^
           -i flightline2.laz ^
           -faf ^
           -min_diff 0.25 -max_diff 0.50 -step 1 ^
           -odir quality -o overlap_025_050.png

We then use a new feature of the LAStools GUI (as of version 180429) to closer inspect larger red or blue areas. We want to use lasmerge and clip out any region that looks suspect for closer examination with lasview. We start the tool in the GUI mode with the ‘-gui’ command and the two flightlines pre-loaded. Using the new PNG overlay roll-out on the left we add the ‘overlap_025_050_diff.png’ image from the quality folder created in the last step and clip out three areas.

lasmerge -i flightline1.laz -i flightline2.laz -gui

You can also clip out these three areas using the command lines below:

lasmerge -i flightline1.laz -i flightline2.laz ^
         -faf ^
         -inside 2172500 1623160 2172600 1623165 ^
         -o clip2500_3160_100x005.laz

lasmerge -i flightline1.laz -i flightline2.laz ^
         -faf ^
         -inside 2172450 1623450 2172550 1623455 ^
         -o clip2450_3450_100x005.laz

lasmerge -i flightline1.laz -i flightline2.laz ^
         -faf ^
         -inside 2172430 1623290 2172530 1623310 ^
         -o clip2430_3290_100x020.laz

A closer inspection of the three cut out slices explains the red and blue areas in the difference image created by lasoverlap. We find a small systematic error in two of the slices. In slice ‘clip2500_3160_100x005.laz‘ the green points from flightline 1 are on average slightly higher than the red points from flightline 2. Vice-versa in slice ‘clip2450_3450_100x005.laz‘ the green points from flightline 1 are on average slightly lower than the red points from flightline 2. However, the error is less than half a foot and only happens near the edges of the flightlines. Given that our surfaces are expected to be “fluffy” anyways (as is typical for Velodyne LiDAR systems), we accept these differences and continue processing.

The strong red and blue colors in the center of the difference image created by lasoverlap is easily explained by looking at slice ‘clip2430_3290_100x020.laz‘. The scanner was “looking” under a gazebo-like open roof structure from two different directions and therefore always seeing parts of the floor in one flightline that were obscured by the roof in the other.

While working with this data we’ve also implemented a new feature for lastrack that computes the 3D distance between LiDAR points and the trajectory and allows storing the result as an additional per point attribute with extra bytes. Those can then be visualized with lasgrid. Here an example:

lastrack -i flightline1.laz ^
         -i flightline2.laz ^
         -track Velodyne001_above719_012.laz ^
         -store_xyz_range_as_extra_bytes ^
         -odix _xyz_range -olaz ^
         =cores 2

lasgrid -i flightline*_xyz_range.laz -merged ^
        -drop_attribute_below 0 1 ^
        -attribute0 -lowest ^
        -false -set_min_max 20 200 ^
        -o quality/closest_xyz_range_020ft_200ft.png

lasgrid -i flightline*_xyz_range.laz -merged ^
        -drop_attribute_below 0 1 ^
        -attribute0 -highest ^
        -false -set_min_max 30 300 ^
        -o quality/farthest_xyz_range_030ft_300ft.png

Below the complete processing pipeline for creating a median ground model from the “fluffy” Velodyne LiDAR data that results in the hillshaded DTM shown here. The workflow is similar to those we have developed in earlier blog posts for Velodyne Puck based systems like the Hovermap and the Yellowscan.

Hillshaded DTM with a resolution of 1 foot generated via a median ground computation by the LAStools processing pipeline detailed below.

lastile -i flightline1.laz ^
        -i flightline2.laz ^
        -faf ^
        -tile_size 250 -buffer 25 -flag_as_withheld ^
        -odir tiles_raw -o somer.laz

lasnoise -i tiles_raw\*.laz ^
         -step_xy 2 -step 1 -isolated 9 ^
         -odir tiles_denoised -olaz ^
          -cores 4

lasthin -i tiles_denoised\*.laz ^
        -ignore_class 7 ^
        -step 1 -percentile 0.05 10 -classify_as 8 ^
        -odir tiles_thinned_1_foot -olaz ^
        -cores 4

lasthin -i tiles_thinned_1_foot\*.laz ^
        -ignore_class 7 ^
        -step 2 -percentile 0.05 10 -classify_as 8 ^
        -odir tiles_thinned_2_foot -olaz ^
        -cores 4

lasthin -i tiles_thinned_2_foot\*.laz ^
        -ignore_class 7 ^
        -step 4 -percentile 0.05 10 -classify_as 8 ^
        -odir tiles_thinned_4_foot -olaz ^
        -cores 4

lasthin -i tiles_thinned_4_foot\*.laz ^
        -ignore_class 7 ^
        -step 8 -percentile 0.05 10 -classify_as 8 ^
        -odir tiles_thinned_8_foot -olaz ^
        -cores 4

lasground -i tiles_thinned_8_foot\*.laz ^
          -ignore_class 1 7 ^
          -town -extra_fine ^
          -odir tiles_ground_lowest -olaz ^
          -cores 4

lasheight -i tiles_ground_lowest\*.laz ^
          -classify_between -0.05 0.5 6 ^
          -odir tiles_ground_thick -olaz ^
          -cores 4

lasthin -i tiles_ground_thick\*.laz ^
        -ignore_class 1 7 ^
        -step 1 -percentile 0.5 -classify_as 2 ^
        -odir tiles_ground_median -olaz ^
        -cores 4

las2dem -i tiles_ground_median\*.laz ^
        -keep_class 2 ^
        -step 1 -use_tile_bb ^
        -odir tiles_dtm -obil ^
        -cores 4

blast2dem -i tiles_dtm\*.bil -merged ^
          -step 1 -hillshade ^
          -o dtm_hillshaded.png

We thank Harris Aerial for sharing this LiDAR data set with us flown by their Carrier H4 Hybrid HE drone carrying a Snoopy A series LiDAR system from LidarUSA.

Three Videos from Full Day Workshop on LiDAR at IIST Trivandrum in India

Three videos from a full day workshop on LiDAR processing at the Department of Earth and Space Sciences of the Indian Institute of Space Science and Technology (IIST) in Thiruvananthapuram, Kerala, India held in October 2017 that was organized by Dr. A. M. Ramiya who we thank very much for the invitation and for being such a kind host. After our usual introduction to LiDAR and LAStools we use three raw flightlines from Ayutthaya, Thailand as example data to perform a complete LiDAR workflow including

  1. LiDAR quality checking such as pulse density, coverage, and flightline alignment
  2. LiDAR preparation (compressing, tiling, denoising, classifying)
  3. LiDAR derivative creation (DTM and DSM rasters, contours, building and vegetation footprints)

You can download the three flightlines used in the tutorial here (line2, line3, line4) to follow along.

morning video

after lunch video

after tea video

First Look with LAStools at LiDAR from Hovermap Drone by CSIRO

Last December we had a chance to visit the team of Dr. Stefan Hrabar at CSIRO in Pullenvale near Brisbane who work on a drone LiDAR system called Hovermap. This SLAM-based system is mainly developed for the purpose of autonomous flight and exploration of GPS-denied environments such as buildings, mines and tunnels. But as the SLAM algorithm continuously self-registers the scan lines it produces a LiDAR point cloud that in itself is a nice product. We started our visit with a short test flight around the on-site tower. You can download the LiDAR data and the drone trajectory of this little survey here:

The Hovermap system is based on the Velodyne Puck Lite (VLP-16) that is much cheaper and more light-weight than many other LiDAR systems. One interesting tidbit in the Hovermap setup is that the scanner is installed such that the entire Puck is constantly rotating as you can see in this video. But  the Velodyne Puck is also known to produce somewhat “fluffy” surfaces with a thickness of a few centimeters. In a previous blog post with data from the YellowScan Surveyor system (that is also based on the Puck) we used a “median ground” surface to deal with the “fluff”. In the following we will have a look at the LiDAR data produced by Hovermap and how to process it with LAStools.

LiDAR data of CSIRO tower acquired during test flight of Hovermap system.

As always we start with a lasinfo report that computes the average density ‘-cd’ and histograms for the intensity and the GPS time:

lasinfo -i CSIRO_Tower\results.laz ^
        -cd ^
        -histo intensity 16 -histo gps_time 2 ^
        -odir CSIRO_Tower\quality -odix _info -otxt

A few excerpts of the resulting lasinfo report that you can download here are below:

lasinfo (180409) report for 'CSIRO_Tower\results.laz'
[...]
 number of point records: 16668904
 number of points by return: 0 0 0 0 0
 scale factor x y z: 0.0001 0.0001 0.0001
 offset x y z: -5.919576153930379 22.785394470724583 9.535698734939086
 min x y z: -138.6437 -125.2552 -34.1510
 max x y z: 126.8046 170.8260 53.2224
WARNING: full resolution of min_x not compatible with x_offset and x_scale_factor: -138.64370561381907
WARNING: full resolution of min_y not compatible with y_offset and y_scale_factor: -125.25518631070418
WARNING: full resolution of min_z not compatible with z_offset and z_scale_factor: -34.150966206894068
WARNING: full resolution of max_x not compatible with x_offset and x_scale_factor: 126.80455330595831
WARNING: full resolution of max_y not compatible with y_offset and y_scale_factor: 170.82597525215334
WARNING: full resolution of max_z not compatible with z_offset and z_scale_factor: -34.150966206894068
[...]
 gps_time 121.288045 302.983110
WARNING: 2 points outside of header bounding box
[...]
covered area in square units/kilounits: 51576/0.05
point density: all returns 323.19 last only 318.40 (per square units)
 spacing: all returns 0.06 last only 0.06 (in units)
WARNING: for return 1 real number of points by return is 16424496 but header entry was not set.
WARNING: for return 2 real number of points by return is 244408 but header entry was not set.
[...]
real max z larger than header max z by 0.000035
real min z smaller than header min z by 0.000035
[...]

Most of these warnings have to do with poorly chosen offset values in the LAS header that have many decimal digits instead of being nice round numbers. The points are stored with sub-millimeter resolution (scale factors of 0.0001) which is unnecessarily precise for a UAV flying a Velodyne Puck where the overall system error can be expected to be on the order of a few centimeters. Also the histogram of return numbers in the LAS header was not populated. We can fix these issues with one call to las2las:

las2las -i CSIRO_Tower\results.laz ^
        -rescale 0.01 0.01 0.01 ^
        -auto_reoffset ^
        -odix _fixed -olaz

If you create another lasinfo report on the fixed file you will see that all the warnings have gone. The file size is now also only 102 MB instead of 142 MB because centimeter coordinate compress much better than sub-millimeter coordinates.

The average density of 318 last return per square meter reported by lasinfo is not that useful for a UAV survey because it does account for the highly varying distribution of LiDAR returns in the area surveyed. With lasgrid we can get a much more clear picture of that.

lasgrid -i CSIRO_Tower\results_fixed.laz ^
        -last_only ^
        -step 0.5 -use_bb -density ^
        -false -set_min_max 0 1500 ^
        -o CSIRO_Tower\quality\density_0_1500.png

LiDAR density: blue is close to zero and red is 1500 or more last returns / sqr mtr

The red dot in the point density indicated an area with over 1500 last returns per square meter. No surprise that this is the take-off and touch-down location of the copter drone. Naturally this spot is completely over-scanned compared to the rest of the area. We can remove these points with the help of the timestamps by cutting off the start and the end of the recording.

The total recording time including take-off, flight around the tower, and touch-down was around 180 seconds or 3 minutes as the lasinfo report tells us. Note that the recorded time stamps are neither “GPS Week Time” nor “Adjusted Standard GPS Time” but an internal system time. By visualizing the trajectory of the UAV with lasview while binning the timestamps into the intensity field we can easily determine what interval of timestamps describes the actual survey flight. First we convert the drone trajectory from the textual ASCII format to the LAZ format with txt2las:

txt2las -i CSIRO_Tower\results_traj.txt ^
        -skip 1 ^
        -parse txyz ^
        -set_classification 12 ^
        -olaz

lasview -i CSIRO_Tower\results_traj.laz ^
        -bin_gps_time_into_intensity 1

Binning timestamps into intensity allows visually determining start and end of survey.

Using lasview and pressing <i> while hovering over those points of the trajectory that appear to be the survey start and end we determine visually that the timestamps between 164 to 264 correspond to the actual survey flight over the area of interest with the take-off and touch-down maneuvers excluded. We use las2las to cut out the relevant part and re-run lasgrid:

las2las -i CSIRO_Tower\results_fixed.laz ^
        -keep_gps_time 164 264 ^
        -o CSIRO_Tower\results_survey.laz

lasgrid -i CSIRO_Tower\results_survey.laz ^
        -last_only ^
        -step 0.5 -use_bb -density ^
        -false -set_min_max 0 1500 ^
        -o CSIRO_Tower\quality\density_0_1500_survey.png

LiDAR density after removing take-off and touch-down maneuvers.

The other set of point we are less interested in are those occasional hits far from the scanner that sample the area too sparsely to be useful for anything. We use lastrack to reclassify points as noise (7) that exceed a x/y distance of 50 meters from the trajectory and then use lasgrid to create another density image without the points classified as noise..

lastrack -i CSIRO_Tower\results_survey.laz ^
         -track CSIRO_Tower\results_traj.laz ^
         -classify_xy_range_between 50 1000 7 ^
         -o CSIRO_Tower\results_xy50.laz

lasgrid -i CSIRO_Tower\results_xy50.laz ^
        -last_only -keep_class 0 ^
        -step 0.5 -use_bb -density ^
        -false -set_min_max 0 1500 ^
        -o CSIRO_Tower\quality\density_0_1500_xy50.png

LiDAR density after removing returns farther than 50 m from trajectory.

We process the remaining points using a typical tile-based processing pipeline. First we run lastile to create tiling of 200 meter by 200 meter tiles with 20 buffers while dropping the noise points::

lastile -i CSIRO_Tower\results_xy50.laz ^
        -drop_class 7 ^
        -tile_size 200 -buffer 20 -flag_as_withheld ^
        -odir CSIRO_Tower\tiles_raw -o eta.laz

Because of the high sampling we expect there to be quite a few duplicate point where all three coordinate x, y, and z are identical. We remove them with a call to lasduplicate:

lasduplicate -i CSIRO_Tower\tiles_raw\*.laz ^
             -unique_xyz ^
             -odir CSIRO_Tower\tiles_unique -olaz ^
             -cores 4

This removes between 12 to 25 thousand point from each tile. Then we use lasnoise to classify isolated points as noise:

lasnoise -i CSIRO_Tower\tiles_unique\*.laz ^
         -step_xy 0.5 -step_z 0.1 -isolated 5 ^
         -odir CSIRO_Tower\tiles_denoised_temp -olaz ^
         -cores 4

Aggressive parameters assure most noise point below ground are found.

This classifies between 13 to 23 thousand point from each tile into the noise classification code 7. We use rather aggressive settings to make sure we get most of the noise points that are below the terrain. Getting a correct ground classification in the next few steps is the main concern now even if this means that many points above the terrain on wires, towers, or vegetation will also get miss-classified as noise (at least temporarily). Next we use lasthin to classify a subset of points with classification code 8 on which we will then run the ground classification. We classify each point that is closest to the 5th percentile in elevation per 25 cm by 25 cm grid cell given there are at least 20 non-noise points in a cell. We then repeat this while increasing the cell size to 50 cm by 50 cm and 100 cm by 100 cm.

lasthin -i CSIRO_Tower\tiles_denoised_temp\*.laz ^
        -ignore_class 7 ^
        -step 0.25 -percentile 5 20 -classify_as 8 ^
        -odir CSIRO_Tower\tiles_thinned_025 -olaz ^
        -cores 4

lasthin -i CSIRO_Tower\tiles_thinned_025\*.laz ^
        -ignore_class 7 ^
        -step 0.50 -percentile 5 20 -classify_as 8 ^
        -odir CSIRO_Tower\tiles_thinned_050 -olaz ^
        -cores 4

lasthin -i CSIRO_Tower\tiles_thinned_025\*.laz ^
        -ignore_class 7 ^
        -step 1.00 -percentile 5 20 -classify_as 8 ^
        -odir CSIRO_Tower\tiles_thinned_100 -olaz ^
        -cores 4

 

Then we ground classify the points that were classified into the temporary classification code 8 in the previous step using lasground.

lasground -i CSIRO_Tower\tiles_thinned_100\*.laz ^
          -ignore_class 7 0 ^
          -town -ultra_fine ^
          -odir CSIRO_Tower\tiles_ground -olaz ^
          -cores 4

The resulting ground points are a lower envelope of the “fluffy” sampled surfaces produced by the Velodyne Puck scanner. We use lasheight to thicken the ground by moving all points between 1 cm below and 6 cm above the TIN of these “low ground” points to a temporary classification code 6 representing a “thick ground”. We also undo the overly aggressive noise classifications above the ground by setting all higher points back to classification code 1 (unclassified).

lasheight -i CSIRO_Tower\tiles_ground\*.laz ^
          -classify_between -0.01 0.06 6 ^
          -classify_above 0.06 1 ^
          -odir CSIRO_Tower\tiles_ground_thick -olaz ^
          -cores 4

Profile view for 25 centimeter wide strip of open terrain. Top: Green points are low ground. Orange points are thickened ground with 5 cm drop lines. Middle: Brown points are median ground computed from thick ground. Bottom: Comparing low ground points (in green) with median ground points (in brown).

From the “thick ground” we then compute a “median ground” using lasthin in a similar fashion as we used it before. A profile view for a 25 centimeter wide strip of open terrain illustrates the workflow of going from “low ground” via “thick ground” to “median ground” and shows the slight difference in elevation between the two.

lasthin -i CSIRO_Tower\tiles_ground_thick\*.laz ^
        -ignore_class 0 1 7 ^
        -step 0.25 -percentile 50 10 -classify_as 2 ^
        -odir CSIRO_Tower\tiles_ground_median_025 -olaz ^
        -cores 4

lasthin -i CSIRO_Tower\tiles_ground_median_025\*.laz ^
        -ignore_class 0 1 7 ^
        -step 0.50 -percentile 50 10 -classify_as 2 ^
        -odir CSIRO_Tower\tiles_ground_median_050 -olaz ^
        -cores 4

lasthin -i CSIRO_Tower\tiles_ground_median_050\*.laz ^
        -ignore_class 0 1 7 ^
        -step 1.00 -percentile 50 10 -classify_as 2 ^
        -odir CSIRO_Tower\tiles_ground_median_100 -olaz ^
        -cores 4

Then we use lasnoise once more with more conservative settings to remove the noise points that are sprinkled around the scene.

lasnoise -i CSIRO_Tower\tiles_ground_median_100\*.laz ^
         -step_xy 1.0 -step_z 1.0 -isolated 5 ^
         -odir CSIRO_Tower\tiles_denoised -olaz ^
         -cores 4

While we classify the scene into building roofs, vegetation, and everything else with lasclassify we also move all (unused) classifications to classification code 1 (unclassified). You may play with the parameters of lasclassify (see README) to achieve better a building classification. However, those buildings the laser can peek into (either via a window or because they are gazebo-like structures) will not be classified correctly. unless you remove the points that are under the roof somehow.

lasclassify -i CSIRO_Tower\tiles_denoised\*.laz ^
            -ignore_class 7 ^
            -change_classification_from_to 0 1 ^
            -change_classification_from_to 6 1 ^
            -step 1 ^
            -odir CSIRO_Tower\tiles_classified -olaz ^
            -cores 4

A glimpse at the final classification result is below. A hillshaded DTM and a strip of classified points. Of course the tower was miss-classified as vegetation given that it looks just like a tree to the logic used in lasclassify.

The hillshaded DTM with a strip of classified points.

Finally we remove the tile buffers (that were really important for tile-based processing) with lastile:

lastile -i CSIRO_Tower\tiles_classified\*.laz ^
        -remove_buffer ^
        -odir CSIRO_Tower\tiles_final -olaz ^
        -cores 4

And publish the LiDAR point cloud as version 1.6 of Potree using laspublish:

laspublish -i CSIRO_Tower\tiles_final\*.laz ^
           -i CSIRO_Tower\results_traj.laz ^
           -only_3D -elevation -overwrite -potree16 ^
           -title "CSIRO Tower" ^
           -description "HoverMap test flight, 18 Dec 2017" ^
           -odir CSIRO_Tower\tiles_portal -o portal.html -olaz

Note that we also added the trajectory of the drone because it looks nice and gives a nice illustration of how the UAV was scanning the scene.

Via Potree we can publish and explore the final point cloud using any modern Web browser.

We would like to thank the entire team around Dr. Stefan Hrabar for taking time out of their busy schedules just a few days before Christmas.

Removing Low Noise from RIEGL’s VUX-1 UAV LiDAR flown in the Philippines

In this tutorial we are removing some “tricky” low noise from LiDAR point clouds in order to produce a high-resolution Digital Terrain Model (DTM). The data was flown above a tropical beach and mangrove area in the Philippines using a VUX-1 UAV based system from RIEGL mounted on a helicopter. The survey was done as a test flight by AB Surveying who have the capacity to fly such missions in the Philippines and who have allowed us to share this data with you for educational purposes. You can download the data (1 GB) here. It covers a popular twin beach knows as “Nacpan” near El Nido in Palawan (that we happen to have visited in 2014).

A typical beach fringed by coconut palms in Palawan, Philippines.

We start our usual quality check with a run of lasinfo. We add the ‘-cd’ switch to compute an average point density and the ‘-histo gps_time 1’ switch to produce a 1 second histogram for the GPS time stamps.

lasinfo -i lalutaya.laz ^
        -cd ^
        -histo gps_time 1 ^
        -odix _info -otxt

You can download the resulting lasinfo report here. It tells us that there are 118,740,310 points of type 3 (with RGB colors) with an average density of 57 last returns per square meter. The point coordinates are in the “PRS92 / Philippines 1” projection with EPSG code 3121 that is based on the “Clarke 1866” ellipsoid.

Datum Transform

We prefer to work in an UTM projection based on the “WGS 1984” ellipsoid, so we will first perform a datum transform based on the seven parameter Helmert transformation – a capacity that was recently added to LAStools. For this we first need a transform to get to geocentric or Earth-Centered Earth-Fixed (ECEF) coordinates on the current “Clarke 1866” ellipsoid, then we apply the Helmert transformation that operates on geocentric coordinates and whose parameters are listed in the TOWGS84 string of EPSG code 3121 to get to geocentric or ECEF coordinates on the “WGS 1984” ellipsoid. Finally we can convert the coordinates to the respective UTM zone. These three calls to las2las accomplish this.

las2las -i lalutaya.laz ^
        -remove_all_vlrs ^
        -epsg 3121 ^
        -target_ecef ^
        -odix _ecef_clark1866 -olaz

las2las -i lalutaya_ecef_clark1866.laz ^
        -transform_helmert -127.62,-67.24,-47.04,-3.068,4.903,1.578,-1.06 ^
        -wgs84 -ecef ^
        -ocut 10 -odix _wgs84 -olaz
 
las2las -i lalutaya_ecef_wgs84.laz ^
        -target_utm auto ^
        -ocut 11 -odix _utm -olaz

In these steps we implicitly reduced the resolution that each coordinate was stored with from quarter-millimeters (i.e. scale factors of 0.00025) to the default of centimeters (i.e. scale factors of 0.01) which should be sufficient for subsequent vegetation analysis. The LAZ files also compress better when coordinates a lower resolution so that the ‘lalutaya_utm.laz’ file is over 200 MB smaller than the original ‘lalutaya.laz’ file. The quantization error this introduces is probably still below the overall scanning error of this helicopter survey.

Flightline Recovery

Playing back the file visually with lasview suggests that it contains more than one flightline. Unfortunately the point source ID field of the file is not properly populated with flightline information. However, when scrutinizing the GPS time stamp histogram in the lasinfo report we can see an occasional gap. We highlight two of these gaps in red between GPS second 540226 and 540272 and GPS second 540497 and 540556 in this excerpt from the lasinfo report:

gps_time histogram with bin size 1
[...]
 bin 540224 has 125768
 bin 540225 has 116372
 bin 540226 has 2707
 bin 540272 has 159429
 bin 540273 has 272049
 bin 540274 has 280237
[...]
 bin 540495 has 187103
 bin 540496 has 180421
 bin 540497 has 126835
 bin 540556 has 228503
 bin 540557 has 275025
 bin 540558 has 273861
[...]

We can use lasplit to recover the original flightlines based on gaps in the continuity of GPS time stamps that are bigger than 10 seconds:

lassplit -i lalutaya_utm.laz ^
         -recover_flightlines_interval 10 ^
         -odir strips_raw -o lalutaya.laz

This operation splits the points into 11 separate flightlines. The points within each flightline are stored in the order that the vendor software – which was RiPROCESS 1.7.2 from RIEGL according to the lasinfo report – had written them to file. We can use lassort to bring them back into the order they were acquired in by sorting first on the GPS time stamp and then on the return number field:

lassort -i strips_raw\*.laz ^
        -gps_time -return_number ^
        -odir strips_sorted -olaz ^
        -cores 4

Now we turn the sorted flightlines into tiles (with buffers !!!) for further processing. We also erase the current classification of the data into ground (2) and medium vegetation (4) as a quick visual inspection with lasview immediately shows that those are not correct:

lastile -i strips_sorted\*.laz ^
        -files_are_flightlines ^
        -set_classification 0 ^
        -tile_size 250 -buffer 30 -flag_as_withheld ^
        -odir tiles_raw -o lalu.laz

Quality Checking

Next comes the standard check of flightline overlap and alignment check with lasoverlap. Once more it become clear why it is so important to have flightline information. Without we may have missed what we are about to notice. We create false color images load into Google Earth to visually assess the situation. We map all absolute differences between flightlines below 5 cm to white and all absolute differences above 30 cm to saturated red (positive) or blue (negative) with a gradual shading from white to red or blue for any differences in between. We also create an overview KML file that lets us quickly see in which tile we can find the points for a particular area of interest with lasboundary.

lasoverlap -i tiles_raw\*.laz ^
           -step 1 -min_diff 0.05 -max_diff 0.30 ^
           -odir quality -opng ^
           -cores 4

lasboundary -i tiles_raw\*.laz ^
            -use_tile_bb -overview -labels ^
            -o quality\overview.kml

The resulting visualizations show (a) that our datum transform to the WGS84 ellipsoid worked because the imagery aligns nicely with Google Earth and (b) that there are several issues in the data that require further scrutiny.

In general the data seems well aligned (most open areas are white) but there are blue and red lines crossing the survey area. With lasview have a closer look at the visible blue lines running along the beach in tile ‘lalu_765000_1252750.laz’ by repeatedly pressing ‘x’ to select a different subset and ‘x’ again to view this subset up close while pressing ‘c’ to color it differently:

lasview -i tiles_raw\lalu_765000_1252750.laz

These lines of erroneous points do not only happen along the beach but also in the middle of and below the vegetation as can be seen below:

Our initial hope was to use the higher than usual intensity of these erroneous points to reclassify them to some classification code that we would them exclude from further processing. Visually we found that a reasonable cut-off value for this tile would be an intensity above 35000:

lasview -i tiles_raw\lalu_765000_1252750.laz ^
        -keep_intensity_above 35000 ^
        -filtered_transform ^
        -set_classification 23

However, while this method seems successful on the tile shown above it fails miserably on others such as ‘lalu_764250_1251500.laz’ where large parts of the beach are very reflective and result in high intensity returns to to the dry white sand:

lasview -i tiles_raw\lalu_764250_1251500.laz ^
        -keep_intensity_above 35000 ^
        -filtered_transform ^
        -set_classification 23

Low Noise Removal

In the following we describe a workflow that can remove the erroneous points below the ground so that we can at least construct a high-quality DTM from the data. This will not, however, remove the erroneous points above the ground so a subsequent vegetation analysis would still be affected. Our approach is based on two obervations (a) the erroneous points affect only a relatively small area and (b) different flightlines have their erroneous points in different areas. The idea is to compute a set of coarser ground points separately for each flightline and – when combining them in the end – to pick higher ground points over lower ones. The combined points should then define a surface that is above the erroneous below-ground points so that we can mark them with lasheight as not to be used for the actual ground classification done thereafter.

The new huge_las_file_extract_ground_points_only.bat example batch script that you can download here does all the work needed to compute a set of coarser ground points for each flightline. Simply edit the file such that the LAStools variable points to your LAStools\bin folder and rename it to end with the *.bat extension. Then run:

huge_las_file_extract_ground_points_only strips_sorted\lalutaya_0000001.laz strips_ground_only\lalutaya_0000001.laz
huge_las_file_extract_ground_points_only strips_sorted\lalutaya_0000002.laz strips_ground_only\lalutaya_0000001.laz
huge_las_file_extract_ground_points_only strips_sorted\lalutaya_0000003.laz strips_ground_only\lalutaya_0000001.laz
...
huge_las_file_extract_ground_points_only strips_sorted\lalutaya_0000009.laz strips_ground_only\lalutaya_0000009.laz
huge_las_file_extract_ground_points_only strips_sorted\lalutaya_0000010.laz strips_ground_only\lalutaya_0000010.laz
huge_las_file_extract_ground_points_only strips_sorted\lalutaya_0000011.laz strips_ground_only\lalutaya_0000011.laz

The details on how this batch script works – a pretty standard tile-based multi-core processing workflow – are given as comments in this batch script. Now we have a set of individual ground points computed separately for each flightline and some will include erroneous points below the ground that the lasground algorithm by its very nature is likely to latch on to as you can see here:

The trick is now to utilize the redundancy of multiple scans per area and – when combining flightlines – to pick higher rather than lower ground points in overlap areas by using the ground point closest to the 75th elevation percentile per 2 meter by 2 meter area with at least 3 or more points with lasthin:

lasthin -i strips_ground_only\*.laz -merged ^
        -step 2 -percentile 75 3 ^
        -o lalutaya_ground_only_2m_75_3.laz

There are still some non-ground points in the result as ground-classifying of flightlines individually often results in vegetation returns being included in sparse areas along the edges of the flight lines but we can easily get rid of those:

lasground_new -i lalutaya__ground_only_2m_75_3.laz ^
              -town -hyper_fine ^
              -odix _g -olaz

We sort the remaining ground points into a space-filling curve order with lassort and spatially index them with lasindex so they can be efficiently accessed by lasheight in the next step.

lassort -i lalutaya__ground_only_2m_75_3_g.laz ^
        -keep_class 2 ^
        -o lalutaya_ground.laz

lasindex -i lalutaya_ground.laz

Finally we have the means to robustly remove the erroneous points below the ground from all tiles. We use lasheight with the ground points we’ve just so painstakingly computed to classify all points 20 cm or more below the ground surface they define into classification code 23. Later we simply can ignore this classification code during processing:

lasheight -i tiles_raw\*.laz ^
          -ground_points lalutaya_ground.laz ^
          -do_not_store_in_user_data ^
          -classify_below -0.2 23 ^
          -odir tiles_cleaned -olaz ^
          -cores 4

Rather than trying to ground classify all remaining points we run lasground on a thinned subset of all points. For this we mark the lowest point in every 20 cm by 20 cm grid cell with some temporary classification code such as 6.

lasthin -i tiles_cleaned\*.laz ^
        -ignore_class 23 ^
        -step 0.20 -lowest -classify_as 6 ^
        -odir tiles_thinned -olaz ^
        -cores 4

Finally we can run lasground to compute the ground classification considering all points with classification code 6 by ignoring all points with classification codes 23 and 0.

lasground_new -i tiles_thinned\*.laz ^
              -ignore_class 23 0 ^
              -city -hyper_fine ^
              -odir tiles_ground_new -olaz ^
              -cores 4

And finally we can create a DTM with a resolution of 25 cm using las2dem and the result is truly beautiful:

las2dem -i tiles_ground_new\*.laz ^
        -keep_class 2 ^
        -step 0.25 -use_tile_bb ^
        -odir tiles_dtm_25cm -obil ^
        -cores 4

We have to admit that a few bumps are left (see mouse cursor below) but adjusting the parameters presented here is left as an exercise to the reader.

We would again like to acknowledge AB Surveying whose generosity has made this blog article possible. They have the capacity to fly such missions in the Philippines and who have allowed us to share this data with you for educational purposes.

Scotland’s LiDAR goes Open Data (too)

Following the lead of England and Wales, the Scottish LiDAR is now also open data. The implementation of such an open geospatial policy in the United Kingdom was spear-headed by the Environment Agency of England who started to make all of their LiDAR holdings available as open data. In September 2015 they opened DTM and DSM raster derivatives down to 25 cm resolution and in March 2016 also the raw point clouds went online our compressed and open LAZ format (more info here) – all with the very permissible Open Government Licence v3. This treasure cove of geospatial data was collected by the Environment Agency Geomatics own survey aircraft mainly for flood mapping purposes. The data that had been access restricted for the past 17 years of operation and was made open only after it was shown that restricting access in order to recover costs to finance future operations – a common argument for withholding tax-payer funded data – was nothing but an utter myth. This open data policy has resulted in an incredible re-use of the LiDAR and the Environment Agency has literally been propelled into the role of a “champion for open data” inspiring Wales (possibly the German states of North-Rhine Westfalia and Thuringia) and now also Scotland to open up their geospatial archives as well …

Huge LAS files available for download from the Scottish Open Data portal.

We went to the nice online portal of Scotland to download three files from the Phase II LiDAR for Scotland that are provided as uncompressed LAS files, namely LAS_NN45NE.las, LAS_NN55NE.las, and LAS_NN55NW.las, whose sizes are listed as 1.2 GB, 2.8 GB, and 4.7 GB in the screenshot above. Needless to say that it took quite some time and several restarts (using wget with option ‘-c’) to successfully download these very large LAS files.

laszip -i LAS_NN45NE.las -odix _cm -olaz -rescale 0.01 0.01 0.01 
laszip -i LAS_NN45NE.las -odix _mm -olaz
laszip -i LAS_NN55NE.las -odix _cm -olaz -rescale 0.01 0.01 0.01 
laszip -i LAS_NN55NE.las -odix _mm -olaz
laszip -i LAS_NN55NW.las -odix _cm -olaz -rescale 0.01 0.01 0.01 
laszip -i LAS_NN55NW.las -odix _mm -olaz

After downloading we decided to see how well these files compress with LASzip by running the six commands shown above creating LAZ files when re-scaling of coordinate resolution to centimeter (cm) and LAZ files with the original millimeter (mm) coordinate resolution (i.e. the original scale factors are 0.001 which is somewhat excessive for aerial LiDAR where the error in position per coordinate is typically between 5 cm and 20 cm). Below you see the resulting file sizes for the three different files.

 1,164,141,247 LAS_NN45NE.las
   124,351,690 LAS_NN45NE_cm.laz (1 : 9.4)
   146,651,719 LAS_NN45NE_mm.laz (1 : 7.9)
 2,833,123,863 LAS_NN55NE.las
   396,521,115 LAS_NN55NE_cm.laz (1 : 7.1)
   474,767,495 LAS_NN55NE_mm.laz (1 : 6.0)
 4,664,782,671 LAS_NN55NW.las
   531,454,473 LAS_NN55NW_cm.laz (1 : 8.8)
   629,141,151 LAS_NN55NW_mm.laz (1 : 7.4)

The savings in download time and storage space of storing the LiDAR in LAZ versus LAS are sixfold to tenfold. If I was a tax payer in Scotland and if my government was hosting open data on in the Amazon cloud (i.e. paying for AWS cloud services with my taxes) I would encourage them to store their data in a more compressed format. Some more details on the data.

According to the provided meta data, the Scottish Public Sector LiDAR Phase II dataset was commissioned by the Scottish Government in response to the Flood Risk Management Act (2009). The project was managed by Sniffer and the contract was awarded to Fugro BKS. Airborne LiDAR data was collected for 66 sites (the dataset does not have full national coverage) totaling 3,516 km^2 between 29th November 2012 and 18th April 2014. The point density was a minimum of 1 point/sqm, and approximately 2 points/sqm on average. A DTM and DSM were produced from the point clouds, with 1m spatial resolution. The Coordinate reference system is OSGB 1936 / British National Grid (EPSG code 27700). The data is licensed under an Open Government Licence. However, under the use constraints section it now only states that the following attribution statement must be used to acknowledge the source of the information: “Copyright Scottish Government and SEPA (2014)” but also that Fugro retain the commercial copyright, which is somewhat disconcerting and may require more clarification. According to this tweet a lesser license (NCGL) applies to the raw LiDAR point clouds. Below a lasinfo report for the large LAS_NN55NW.las as well as several visualizations with lasview.

lasinfo (170915) report for LAS_NN55NW.las
reporting all LAS header entries:
 file signature: 'LASF'
 file source ID: 0
 global_encoding: 1
 project ID GUID data 1-4: 00000000-0000-0000-0000-000000000000
 version major.minor: 1.2
 system identifier: 'Riegl LMS-Q'
 generating software: 'Fugro LAS Processor'
 file creation day/year: 343/2016
 header size: 227
 offset to point data: 227
 number var. length records: 0
 point data format: 1
 point data record length: 28
 number of point records: 166599373
 number of points by return: 149685204 14102522 2531075 280572 0
 scale factor x y z: 0.001 0.001 0.001
 offset x y z: 250050 755050 270
 min x y z: 250000.000 755000.000 203.731
 max x y z: 254999.999 759999.999 491.901
reporting minimum and maximum for all LAS point record entries ...
 X -50000 4949999
 Y -50000 4949999
 Z -66269 221901
 intensity 39 2046
 return_number 1 4
 number_of_returns 1 4
 edge_of_flight_line 0 1
 scan_direction_flag 1 1
 classification 1 11
 scan_angle_rank -30 30
 user_data 0 3
 point_source_ID 66 91
 gps_time 38230669.389034 38402435.753789
number of first returns: 149685204
number of intermediate returns: 2813604
number of last returns: 149687616
number of single returns: 135599244
overview over number of returns of given pulse: 135599244 23122229 6754118 1123782 0 0 0
histogram of classification of points:
 287819 unclassified (1)
 109019874 ground (2)
 14476880 low vegetation (3)
 3487218 medium vegetation (4)
 39141518 high vegetation (5)
 165340 building (6)
 13508 rail (10)
 7216 road surface (11)

Kudos to the Scottish government for opening their data. We hereby acknowledge the source of the LiDAR that we have used in the experiments above as “Copyright Scottish Government and SEPA (2014)”.