Removing Excessive Low Noise from Dense-Matching Point Clouds

Point clouds produced with dense-matching by photogrammetry software such as SURE, Pix4D, or Photoscan can include a fair amount of the kind of “low noise” seen below. Low noise causes trouble when attempting to construct a Digital Terrain Model (DTM) from the points, as common algorithms for classifying points into ground and non-ground points – such as lasground – tend to “latch onto” those low points, thereby producing a poor representation of the terrain. This blog post describes one possible LAStools workflow for eliminating excessive low noise. It was developed after a question in the LAStools user forum by LASmoons holder Muriel Lavy, who was able to share her noisy data with us. See this, this, this, this, this, and this blog post for further reading on this topic.

Here you can download the dense-matching point cloud that we are using in the following workflow:

We leave the usual inspection of the content with lasinfo, lasview, and lasvalidate that we always recommend on newly obtained data as an exercise to the reader. Note that a check for proper alignment of flightlines with lasoverlap that we consider mandatory for LiDAR data is not applicable for dense-matching points.
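
For readers who want to follow along, a minimal sketch of that inspection could look as follows (the report and output names are our own choices):

lasinfo -i muriel\20161127_Pancalieri_UTM.laz ^
        -odix _info -otxt

lasvalidate -i muriel\20161127_Pancalieri_UTM.laz ^
            -o muriel\validate.xml

lasview -i muriel\20161127_Pancalieri_UTM.laz -gui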

With lastile we turn the original file with 87,261,083 points into many smaller 500 by 500 meter tiles for efficient multi-core processing. Each tile is given a 25 meter buffer to avoid edge artifacts. The buffer points are marked as withheld for easier on-the-fly removal. We add a (terser) description of the WGS84 UTM zone 32N to each tile via the corresponding EPSG code 32632:
lastile -i muriel\20161127_Pancalieri_UTM.laz ^
        -tile_size 500 -buffer 25 -flag_as_withheld ^
        -epsg 32632 ^
        -odir muriel\tiles_raw -o panca.laz
Because dense-matching points often have a poor point order in the files they get delivered in, we use lassort to rearrange them into a space-filling curve order, as this will speed up most of the following processing steps:
lassort -i muriel\tiles_raw\panca*.laz ^
        -odir muriel\tiles_sorted -olaz ^
        -cores 7
We then run lasthin to reclassify the highest point of every 2.5 by 2.5 meter grid cell with classification code 8. As the spacing of the dense-matched points is around 40 cm in both x and y, around 40 points will fall into each such grid cell from which the highest is then classified as 8:
lasthin -i muriel\tiles_sorted\panca*.laz ^
        -step 2.5 ^
        -highest -classify_as 8 ^
        -odir muriel\tiles_thinned -olaz ^
        -cores 7
Considering only those points classified as 8 in the last step, we then run lasnoise to find points that are highly isolated in wide and flat neighborhoods and reclassify them as 7. See the README file of lasnoise for a detailed explanation of the different parameters:
lasnoise -i muriel\tiles_thinned\panca*.laz ^
         -ignore_class 0 ^
         -step_xy 5 -step_z 0.1 -isolated 4 ^
         -classify_as 7 ^
         -odir muriel\tiles_isolated -olaz ^
         -cores 7
Now we run a temporary ground classification on only (!!!) those points that are still classified as 8 using the default parameters of lasground. Hence we only use the points that were the highest points on the 2.5 by 2.5 meter grid and that were not classified as noise in the previous step. See the README file of lasground for a detailed explanation of the different parameters:
lasground -i muriel\tiles_isolated\panca*.laz ^
          -city -ultra_fine -ignore_class 0 7 ^
          -odir muriel\tiles_temp_ground -olaz ^
          -cores 7
The result of this temporary ground filtering is then merely used to mark all points that are 0.5 meter below the triangulated TIN of these temporary ground points with classification code 12 using lasheight. See the README file of lasheight for a detailed explanation of the different parameters:
lasheight -i muriel\tiles_temp_ground\panca*.laz ^
          -do_not_store_in_user_data ^
          -classify_below -0.5 12 ^
          -odir muriel\tiles_temp_denoised -olaz ^
          -cores 7
In the resulting tiles the low noise (but also many points above the ground) is now marked, and in a final step we produce properly classified denoised tiles by re-mapping the temporary classification codes to conventions that are more consistent with the ASPRS LAS specification using las2las:
las2las -i muriel\tiles_temp_denoised\panca*.laz ^
        -change_classification_from_to 1 0 ^
        -change_classification_from_to 2 0 ^
        -change_classification_from_to 7 0 ^
        -change_classification_from_to 12 7 ^
        -odir muriel\tiles_denoised -olaz ^
        -cores 7
Let us visually check what each of the above steps has produced by zooming in on a 300 meter by 100 meter strip of points with the bounding box (388500,4963125) to (388800,4963225) in tile ‘panca_388500_4963000.laz’:
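
One way to reproduce such a close-up yourself is with the standard area-of-interest filters of LAStools; the lasview call below is a sketch that assumes the denoised version of this tile:

lasview -i muriel\tiles_denoised\panca_388500_4963000.laz ^
        -keep_xy 388500 4963125 388800 4963225 ^
        -color_by_classification
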
The final classification of all points that are not already classified as noise (7) into ground (2) or non-ground (1) was done with one last run of lasground. See the README file of lasground for a detailed explanation of the different parameters:
lasground -i muriel\tiles_denoised\panca*.laz ^
          -ignore_class 7 ^
          -city -ultra_fine ^
          -odir muriel\tiles_ground -olaz ^
          -cores 7
Then we create seamless hill-shaded DTM tiles by triangulating all the points classified as ground into a temporary TIN (including those in the 25 meter buffer) and then rasterizing only the inner 500 meter by 500 meter of each tile with option ‘-use_tile_bb’ of las2dem. For more details on the importance of buffers in tile-based processing see this blog post here.
las2dem -i muriel\tiles_ground\panca*.laz ^
        -keep_class 2 ^
        -step 1 -hillshade ^
        -use_tile_bb ^
        -odir muriel\tiles_dtm -opng ^
        -cores 7

And here the original DSM side-by-side with the resulting DTM after low noise removal. One densely forested area near the center of the data was not entirely removed due to the lack of ground points in this area. Integrating external ground points or manual editing with lasview are two possible ways to rectify these few remaining errors …

Integrating External Ground Points in Forests to Improve DTM from Dense-Matching Photogrammetry

The biggest problem of generating a Digital Terrain Model (DTM) from the photogrammetric point clouds that are produced from aerial imagery with dense-matching software such as SURE, Pix4D, or Photoscan is dense vegetation: when plants completely cover the terrain not a single point is generated on the ground. This is different for LiDAR point clouds as the laser can even penetrate dense multi-level tropical forests. The complete lack of ground points in larger vegetated areas such as closed forests or dense plantations means that the many processing workflows for vegetation analysis that have been developed for LiDAR cannot be used for photogrammetric point clouds … unless … well, unless we are getting those missing ground points some other way. In the following we see how to integrate external ground points to generate a reasonable DTM under a dense forest with LAStools. See this, this, this, this, and this article for further reading.

Here you can download the dense matching point cloud, the manually collected ground points, and the forest stand delineating polygon that we are using in the following example work flow:

We leave the usual inspection of the content with lasinfo and lasview that we always recommend on newly obtained data as an exercise to the reader. Using las2dem and lasgrid we created the Google Earth overlays shown above to visualize the extent of the dense matched point cloud and the distribution of the manually collected ground points:

las2dem -i DenseMatching.laz ^
        -thin_with_grid 1.0 ^
        -extra_pass ^
        -step 2.0 ^
        -hillshade ^
        -odix _hill_2m -opng

lasgrid -i ManualGround.laz ^
        -set_RGB 255 0 0 ^
        -step 10 -rgb ^
        -odix _grid_10m -opng

Attempts to ground-classify the dense-matching point cloud directly are futile as there are no ground points under the canopy in the heavily forested area. Therefore 558 ground points were manually surveyed in the forest of interest that are around 50 to 120 meters apart from one another. We show how to integrate these points into the dense-matching point cloud so that we can successfully extract bare-earth information from the data.

In the first step we “densify” the manually collected ground points by interpolating them with triangles onto a raster of 2 meter resolution that we store as LAZ points with las2dem. You could consider other interpolation schemes to “densify” the ground points, here we use simple linear interpolation to prove the concept. Due to the varying distance between the manually surveyed ground points we allow interpolating triangles with edge lengths of up to 125 meters. These triangles then also cover narrow open areas next to the forest, so we clip the interpolated ground points against the forest stand delineating polygon with lasclip to classify those points that are really in the forest as “key points” (class 8) and all others as “noise” (class 7).

las2dem -i ManualGround.laz ^
        -step 2 ^
        -kill 125 ^
        -odix _2m -olaz

lasclip -i ManualGround_2m.laz ^
        -set_classification 7 ^ 
        -poly forest.shp ^
        -classify_as 8 -interior ^
        -odix _forest -olaz

Below we show the resulting densified ground points colored by elevation that survive the clipping against the forest stand delineating polygon and were classified as “key points” (class 8). The interpolated ground points in narrow open areas next to the forest that fall outside this polygon were classified as “noise” (class 7) and are shown in violet. They will be dropped in the next step.

We then merge the dense matching points with the densified manual ground points (while dropping all the violet points marked as noise) as input to lasthin and reclassify the lowest point per 1 meter by 1 meter grid cell with a temporary code (here we use class 9 that usually refers to “water”). Only the subset of lowest points that receives the temporary classification code 9 will be used for ground classification later.

lasthin -i DenseMatching.laz ^
        -i ManualGround_2m_forest.laz ^
        -drop_class 7 ^
        -merged ^
        -lowest -step 1 -classify_as 9 ^
        -o DenseMatchingAndDensifiedGround.laz

We use the GUI of lasview to pick several interesting areas for visual inspection. The selected points load much faster when the LAZ file is spatially indexed and therefore we first run lasindex. For better orientation we also load the forest stand delineating polygon as an overlay into the GUI.

lasindex -i DenseMatchingAndDensifiedGround.laz 

lasview -i DenseMatchingAndDensifiedGround.laz -gui

We pick the area shown below that contains the target forest with manually collected and densified ground points and a forested area with only dense matching points. The difference could not be more drastic as the visualizations show.

Now we run ground classification with option ‘-town’ of lasground using only the points with the temporary code 9, ignoring all other classifications (0 and 8) in the file. We leave the temporary classification code 9 unchanged for all the points that were not classified with “ground” code 2 so we can visualize later which ones those are.

lasground -i DenseMatchingAndDensifiedGround.laz ^
          -ignore_class 0 8 ^
          -town ^
          -non_ground_unchanged ^
          -o GroundClassified.laz

We again use the GUI of lasview to pick several interesting areas after running lasindex and again load the forest stand delineating polygon as an overlay into the GUI.

lasindex -i GroundClassified.laz 

lasview -i GroundClassified.laz -gui

We pick the area shown below that contains all three scenarios: the target forest with manually collected and densified ground points, an open area with only dense matching points, and a forested area with only dense matching points. The result is as expected: in the target forest the manually collected ground points are used as ground and in the open area the dense-matching points are used as ground. But there is no useful ground in the other forested area.

Now we can compute the heights of the points above ground for our target forest with lasheight and either replace the z elevations in the file or store them separately as “extra bytes”. Then we can compute, for example, a Canopy Height Model (CHM) that color codes the height of the vegetation above the ground with lasgrid. Of course this will only be correct in the target forest where we have “good” ground but not in the other forested areas. We also compute a hillshaded DTM to be able to visually inspect the topography of the generated terrain model.

lasheight -i GroundClassified.laz ^
          -store_as_extra_bytes ^
          -o GroundClassifiedWithHeights.laz

lasgrid -i GroundClassifiedWithHeights.laz ^
        -step 2 ^
        -highest -attribute 0 ^
        -false -set_min_max 0 25 ^
        -o chm.png

las2dem -i GroundClassified.laz ^
        -keep_class 2 -extra_pass ^
        -step 2 ^ 
        -hillshade ^
        -o dtm.png

Here you can download the resulting color-coded CHM and the resulting hill-shaded DTM as Google Earth KMZ overlays. Clearly the resulting CHM is only meaningful in the target forest where we used the manually collected ground points to create a reasonable DTM. In the other forested areas the ground is only correct near the forest edges and gets worse with increasing distance from open areas. The resulting DTM exhibits some interesting looking bumps in the middle of areas with manually collected ground points. Those are a result of using the dense-matching points as ground whenever their elevation is lower than that of the manually collected points (which is decided in the lasthin step). Whether those bumps represent true elevations or are artifacts of erroneously low dense-matching elevations remains to be investigated.

For forests on complex and steep terrain the number of ground points that needs to be manually collected may make such an approach infeasible in practice. However, maybe you have another source of elevation, such as a low-resolution DTM of 10 or 25 meter provided by your local government. Or maybe even a high resolution DTM of 1 or 2 meter from a LiDAR survey you did several years ago. While the forest may have grown a lot in the past years, the ground under the forest will probably not have changed much …
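
Should you have such an external source, one conceivable way to feed it into the workflow above is to let LAStools read the raster as points and substitute those for the manually surveyed ground points. The sketch below assumes a hypothetical 10 meter government DTM stored as ‘dtm_10m.asc’ and relies on LAStools being able to read ASC rasters as if they were points:

las2las -i dtm_10m.asc ^
        -odix _points -olaz

lasclip -i dtm_10m_points.laz ^
        -set_classification 7 ^
        -poly forest.shp ^
        -classify_as 8 -interior ^
        -odix _forest -olaz

From here on the lasthin, lasground, and lasheight steps would proceed exactly as shown above.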

Plots to Stands: Producing LiDAR Vegetation Metrics for Imputation Calculations

Some professionals in remote sensing find LAStools a useful tool to extract statistical metrics from LiDAR that are used to make estimations about a larger area of land from a small set of sample plots. Common applications are prediction of the timber volume or the above-ground biomass for entire forests based on a number of representative plots where exact measurements were obtained with field work. The same technique can also be used to make estimations about animal habitat or coconut yield or to classify the type of vegetation that covers the land. In this tutorial we describe the typical workflow for computing common metrics for smaller plots and larger areas using LAStools.

Download these six LiDAR tiles (1, 2, 3, 4, 5, 6) from a Eucalyptus plantation in Brazil to follow along the step by step instructions of this tutorial. This data is courtesy of Suzano Pulp and Paper. Please also download the two shapefiles that delineate the plots where field measurements were taken and the stands for which predictions are to be made. You should download version 170327 (or higher) of LAStools due to some recent bug fixes.

Quality Checking

Before processing newly received LiDAR data we always perform a quality check first. This ranges from visual inspection with lasview, to printing textual content reports and attribute histograms with lasinfo, to flight-line alignment checks with lasoverlap, pulse density and pulse spacing checks with lasgrid and las2dem, and completeness-of-returns check with lassort followed by lasreturn.

lasinfo -i tiles_raw\CODL0003-C0006.laz ^
        -odir quality -odix _info -otxt

The lasinfo report tells us that there is no projection information. However, we remember that this Brazilian data was in the common SIRGAS 2000 projection, and try a few likely UTM zones to see whether the hillshaded DSM produced by las2dem falls onto the right spot in Google Earth.

las2dem -i tiles_raw\CODL0003-C0006.laz ^
        -keep_first -thin_with_grid 1 ^
        -hillshade -epsg 31983 ^
        -o epsg_check.png

Hillshaded DSM and Google Earth imagery align for EPSG code 31983

The lasinfo report also tells us that the xyz coordinates are stored with millimeter resolution which is a bit of an overkill. For higher and faster LASzip compression we will later lower this to a more appropriate centimeter resolution. It further tells us that the returns are stored using point type 0 and that is a bit unfortunate. This (older) point type does not have a GPS time stamp so that some quality checks (e.g. “completeness of returns” with lasreturn) and operations (e.g. “resorting of returns into acquisition order” with lassort) will not be possible. Fortunately the min-max range of the ‘point source ID’ suggests that this point attribute is correctly populated with flightline numbers so that we can do a check for overlap and alignment of the different flightlines that contribute to the LiDAR in each tile.

lasoverlap -i tiles_raw\*.laz ^
           -min_diff 0.2 -max_diff 0.4 ^
           -epsg 31983 ^
           -odir quality -opng ^
           -cores 3

We run lasoverlap to visualize the amount of overlap between flightlines and the vertical differences between them. The produced images (see below) color code the number of flightlines and the maximum vertical difference between any two flightlines. Most of the area is cyan (2 flightlines) except in the bottom left where the pilot was sloppy and left some gaps in the yellow seams (3 flightlines) so that some spots are only blue (1 flightline). We also see that two tiles in the upper left are partly covered by a diagonal flightline. We will drop that flightline later to create a more uniform density across the tiles. The mostly blue areas in the difference image align more with features in the landscape than with the flightline pattern. Closer inspection shows that these vertical differences occur mainly in the dense old growth forests with species of different heights that are much harder to penetrate by the laser than the uniform and short-lived Eucalyptus plantation that is more of a “dead forest” with little undergrowth or animal habitat.

Interesting observation: The vertical difference of the lowest return from different flightlines computed per 2 meter by 2 meter grid cell could maybe be used as a new forestry metric to help distinguish mono cultures from natural forests.

lasgrid -i tiles_raw\*.laz ^
        -keep_last ^
        -step 2 -point_density ^
        -false -set_min_max 10 20 ^
        -epsg 31983 ^
        -odir quality -odix _d_2m_10_20 -opng ^
        -cores 3

lasgrid -i tiles_raw\*.laz ^
        -keep_last ^
        -step 5 -point_density ^
        -false -set_min_max 10 20 ^
        -epsg 31983 ^
        -odir quality -odix _d_5m_10_20 -opng ^
        -cores 3

We run lasgrid to visualize the pulse density per 2 by 2 meter cell and per 5 by 5 meter cell. The produced images (see below) color code the number of last returns per square meter. The impact of the tall Eucalyptus trees on the density-per-cell computation is evident for the smaller 2 meter cell size in form of a noisy blue/red diagonal in the top right as well as a noisy blue/red area in the bottom left. Both of those turn to a more consistent yellow for the density-per-cell computation with the larger 5 meter cells. Immediately evident is the higher density (red) for those parts of the two tiles in the upper left that are covered by the additional diagonal flightline. The blueish area to the left of the center of the image suggests a consistently lower pulse density whose cause remains to be investigated: Lower reflectivity? Missing last returns? Cloud cover?

The lasinfo report suggests that the tiles are already classified. We could either use the ground classification provided by the vendor or re-classify the data ourselves (using lastile, lasnoise, and lasground). We check the quality of the ground classification by visually inspecting a hillshaded DTM created with las2dem from the ground returns. We buffer the tiles on-the-fly for a seamless hillshade without artifacts along tile boundaries by adding ‘-buffered 25’ and ‘-use_orig_bb’ to the command line. To speed up reading the 25 meter buffers from neighboring tiles we first create a spatial index with lasindex.

lasindex -i tiles_raw\*.laz ^
         -cores 3

las2dem -i tiles_raw\*.laz ^
        -buffered 25 ^
        -keep_class 2 -thin_with_grid 0.5 ^
        -use_orig_bb ^
        -hillshade -epsg 31983 ^
        -odir quality -odix _dtm -opng ^
        -cores 3

hillshaded DTM tiles generated with las2dem and on-the-fly buffering

The resulting hillshaded DTM shows a few minor issues in the ground classification but also a big bump (above the mouse cursor). Closer inspection of this area (you can cut it from the larger tile using the command line below) shows that there is a group of misclassified points about 1200 meters below the terrain. Hence, we will start from scratch to prepare the data for the extraction of forestry metrics.

las2las -i tiles_raw\CODL0004-C0006.laz ^
        -inside_tile 207900 7358350 100 ^
        -o bump.laz

lasview -i bump.laz

bump in hillshaded DTM caused by misclassified low noise

Data Preparation

Using lastile we first tile the data into smaller 500 meter by 500 meter tiles with 25 meter buffers while flagging all points in the buffer as ‘withheld’. In the same step we lower the resolution to centimeter and put a nicer coordinate offset in the LAS header. We also remove the existing classification and classify all points that are much lower than the target terrain as class 7 (aka noise). We also add CRS information and give each tile the base name ‘suzana.laz’.

lastile -i tiles_raw\*.laz ^
        -rescale 0.01 0.01 0.01 ^
        -auto_reoffset ^
        -set_classification 0 ^
        -classify_z_below_as 500.0 7 ^
        -tile_size 500 ^
        -buffer 25 -flag_as_withheld ^
        -epsg 31983 ^
        -odir tiles_buffered -o suzana.laz

With lasnoise we mark the many isolated points that are high above or far below the terrain as class 7 (aka noise). Using two tiles we played around with the ‘step’ parameters until we found good parameter settings. See the README of lasnoise for the exact meaning and the choice of parameters for noise classification. We look at one of the resulting tiles with lasview.

lasnoise -i tiles_buffered\*.laz ^
         -step_xy 4 -step_z 2 ^
         -odir tiles_denoised -olaz ^
         -cores 3

lasview -i tiles_denoised\suzana_206000_7357000.laz ^
        -color_by_classification ^
        -win 1024 192

noise points shown in pink: all points (top), only noise points (bottom)

Next we use lasground to classify the last returns into ground (2) and non-ground (1). It is important to ignore the noise points with classification 7 to avoid the kind of bump we saw in the vendor-delivered classification. We again check the quality of the computed ground classification by producing a hillshaded DTM with las2dem. Here the las2dem command line is slightly different: instead of buffering on-the-fly we use the buffers stored with each tile.

lasground -i tiles_denoised\*.laz ^
          -ignore_class 7 ^
          -nature -extra_fine ^
          -odir tiles_ground -olaz ^
          -cores 3

las2dem -i tiles_ground\*.laz ^
        -keep_class 2 -thin_with_grid 0.5 ^
        -hillshade ^
        -use_tile_bb ^
        -odir quality -odix _dtm_new -opng ^
        -cores 3

Finally, with lasheight we compute how high each return is above the triangulated surface of all ground returns and store this height value in place of the elevation value into the z coordinate using the ‘-replace_z’ switch. This height-normalizes the LiDAR in the sense that all ground returns are set to an elevation of 0 while all other returns get an elevation relative to the ground. The results are height-normalized LiDAR tiles that are ready for producing the desired forestry metrics.

lasheight -i tiles_ground\*.laz ^
          -replace_z ^
          -odir tiles_normalized -olaz ^
          -cores 3

Metric Production

The tool for computing the metrics for the entire area as well as for the individual field plots is lascanopy. Which metrics are well suited for your particular imputation calculation is your job to determine. Maybe first compute a large number of them and then eliminate the redundant ones. Do not use any point from the tile buffers for these calculations. We had flagged them as ‘withheld’ during the lastile operation, so they are easy to drop. We also want to drop the noise points that were classified as 7. And we were planning to drop that additional diagonal flightline we noticed during quality checking. We generated two lasinfo reports with the ‘-histo point_source 1’ option for the two tiles it was covering. From the two histograms it was easy to see that the diagonal flightline has the number 31.
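
As a sketch of that histogram check (we use one of the tile names from above as a stand-in; the actual names of the two upper-left tiles depend on your download):

lasinfo -i tiles_raw\CODL0003-C0006.laz ^
        -histo point_source 1 ^
        -odir quality -odix _ps_histo -otxt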

First we run lascanopy on the 11 plots that you can download here. When running on plots it makes sense to first create a spatial indexing with lasindex for faster querying so that only those tiny parts of the LAZ file need to be loaded that actually cover the plots.

lasindex -i tiles_normalized\*.laz ^
         -cores 3

lascanopy -i tiles_normalized\*.laz -merged ^
          -drop_withheld ^
          -drop_class 7 ^
          -drop_point_source 31 ^
          -lop WKS_PLOTS.shp ^
          -cover_cutoff 3.0 ^
          -cov -dns ^
          -height_cutoff 2.0 ^
          -c 2.0 999.0 ^
          -max -avg -std -kur ^
          -p 25 50 75 95 ^
          -b 30 50 80 ^
          -d 2.0 5.0 10.0 50.0 ^
          -o plots.csv

The resulting ‘plots.csv’ file you can easily process in other software packages. It contains one line for each polygonal plot listed in the shapefile, giving its bounding box followed by all the requested metrics. But why is there a zero maximum height (marked in orange) for plots 6 through 10? All height metrics are computed solely from returns that are higher than the ‘height_cutoff’ that was set to 2 meters. We added the ‘-c 2.0 999.0’ absolute count metric to keep track of the number of returns used in these calculations. Apparently in plots 6 through 10 there was not a single return above 2 meters as the count (also marked in orange) is zero for all these plots. Turns out this Eucalyptus stand had recently been harvested and the new seedlings are still shorter than 2 meters.

more plots.csv
index,min_x,min_y,max_x,max_y,max,avg,std,kur,p25,p50,p75,p95,b30,b50,b80,c00,d00,d01,d02,cov,dns
0,206260.500,7358289.909,206283.068,7358312.477,11.23,6.22,1.91,2.26,4.71,6.01,7.67,9.5,26.3,59.7,94.2,5359,18.9,41.3,1.5,76.3,60.0
1,206422.500,7357972.909,206445.068,7357995.477,13.54,7.5,2.54,1.97,5.32,7.34,9.65,11.62,26.9,54.6,92.2,7030,12.3,36.6,13.3,77.0,61.0
2,206579.501,7358125.909,206602.068,7358148.477,12.22,5.72,2.15,2.5,4.11,5.32,7.26,9.76,46.0,73.7,97.4,4901,24.8,28.7,2.0,66.8,51.2
3,206578.500,7358452.910,206601.068,7358475.477,12.21,5.68,2.23,2.64,4.01,5.14,7.18,10.04,48.3,74.1,95.5,4861,25.7,26.2,2.9,68.0,50.2
4,206734.501,7358604.910,206757.068,7358627.478,15.98,10.3,2.18,2.64,8.85,10.46,11.9,13.65,3.3,27.0,91.0,4946,0.6,32.5,44.5,91.0,77.5
5,207043.501,7358761.910,207066.068,7358784.478,15.76,10.78,2.32,3.43,9.27,11.03,12.49,14.11,3.2,20.7,83.3,4819,1.5,24.7,51.0,91.1,76.8
6,207677.192,7359630.526,207699.760,7359653.094,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.0,0.0,0.0,0,0.0,0.0,0.0,0.0,0.0
7,207519.291,7359372.366,207541.859,7359394.934,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.0,0.0,0.0,0,0.0,0.0,0.0,0.0,0.0
8,207786.742,7359255.850,207809.309,7359278.417,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.0,0.0,0.0,0,0.0,0.0,0.0,0.0,0.0
9,208159.017,7358997.344,208181.584,7359019.911,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.0,0.0,0.0,0,0.0,0.0,0.0,0.0,0.0
10,208370.909,7358602.565,208393.477,7358625.133,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.0,0.0,0.0,0,0.0,0.0,0.0,0.0,0.0

Then we run lascanopy on the entire area and produce one raster per tile for each metric. Here we remove the buffered points with the ‘-use_tile_bb’ switch that also ensures that the produced rasters have exactly the extent of the tiles without buffers. Again, it is imperative that you download version 170327 (or higher) of LAStools for this to work correctly.

lascanopy -version
LAStools (by martin@rapidlasso.com) version 170327 (academic)

lascanopy -i tiles_normalized\*.laz ^
          -use_tile_bb ^
          -drop_class 7 ^
          -drop_point_source 31 ^
          -step 10 ^
          -cover_cutoff 3.0 ^
          -cov -dns ^
          -height_cutoff 2.0 ^
          -c 2.0 999.0 ^
          -max -avg -std -kur ^
          -p 25 50 75 95 ^
          -b 30 50 80 ^
          -d 2.0 5.0 10.0 50.0 ^
          -odir tile_metrics -oasc ^
          -cores 3

The resulting rasters in ASC format can easily be previewed using lasview for some “sanity checking” that our metrics make sense and to get a quick overview about what these metrics look like.

lasview -i tile_metrics\suzana_*max.asc
lasview -i tile_metrics\suzana_*p95.asc
lasview -i tile_metrics\suzana_*p50.asc
lasview -i tile_metrics\suzana_*p25.asc
lasview -i tile_metrics\suzana_*cov.asc
lasview -i tile_metrics\suzana_*d00.asc
lasview -i tile_metrics\suzana_*d01.asc
lasview -i tile_metrics\suzana_*b30.asc
lasview -i tile_metrics\suzana_*b80.asc

The maximum height rasters are useful to inspect more closely as they will immediately tell us if there was any high noise point that slipped through the cracks. And indeed it happened as we see a maximum of 388.55 meters for one of the 10 by 10 meter cells. As we know the expected height of the trees we could have added a ‘-drop_z_above 70’ to the lascanopy command line. Careful, however, when computing forestry metrics in strongly sloped terrain as the terrain slope can significantly lift up returns to heights much higher than that of the tree. This is guaranteed to happen for LiDAR returns from branches that are extending horizontally far over the down-sloped part of the terrain as shown in this paper here.
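
For illustration, a guarded re-run limited to just the maximum height raster might look as follows; the 70 meter cutoff is our assumption based on the expected tree heights:

lascanopy -i tiles_normalized\*.laz ^
          -use_tile_bb ^
          -drop_class 7 ^
          -drop_point_source 31 ^
          -drop_z_above 70 ^
          -step 10 ^
          -height_cutoff 2.0 ^
          -max ^
          -odir tile_metrics -oasc ^
          -cores 3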

We did not use the shapefile for the stands in this exercise. We could have clipped the normalized LiDAR points to these stands using lasclip as shown in the GUI below before generating the raster metrics. This would have saved space and computation time as many of the LiDAR points lie outside of the stands. However, it might be better to do that clipping step on the rasters in whichever GIS software or statistics package you are using for the imputation computation to properly account for partly covered raster cells along the stand boundary. This could be the subject of another blog article … (-:

not all LiDAR was needed to compute metrics for

Leaked: “Classified LiDAR” of Pentagon in LAS 1.4 Format

LiDAR leaks have happened! Black helicopters are in the sky!  A few days ago a tiny tweet leaked the online location of “classified LiDAR” for Washington, DC. This LiDAR really is “classified” and includes an aerial scan of the Pentagon. For rogue scientists world-wide we offer a secret download link. It links to a file code-named ‘pentagon.laz‘ that contains the 8,044,789 “classified” returns of the Pentagon shown below. This “classified file” can be deciphered by any software with native LAZ support. It was encrypted with the “LAS 1.4 compatibility mode” of LASzip. The original LAS 1.4 content was encoded into an inconspicuous-looking LAZ file. New point attributes (such as the scanner channel) were hidden as “extra bytes” for fully lossless encryption. Use ‘laszip‘ to fully decode the original “classified” LAS 1.4 file … (-;
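
Decoding the file back to its original LAS 1.4 form should be a one-liner, assuming a laszip build recent enough to understand the compatibility-mode “extra bytes”; the output name is our choice:

laszip -i pentagon.laz ^
       -o pentagon_las14.las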

Seriously, a tiled LiDAR data set for the District of Columbia flown in 2015 is available for anyone to use on Amazon S3 with a very permissive open data license, namely the Creative Commons Attribution 3.0 License. The LiDAR coverage can be explored via this interactive map. The tiles are provided in LAS 1.4 format and use the new point type 6. We downloaded a few tiles near the White House, the Capitol, and the Pentagon to test the “native LAS 1.4 extension” of our LASzip compressor which will be released soon (a prototype for testing is already available). As these uncompressed LAS files are YUUUGE we use the command line utility ‘wget‘ for downloading. With option ‘-c’ the download continues where it left off in case the transfer gets interrupted.

LiDAR pulse density from 20 or less (blue) to 100 or more (red) pulses per square meter.

We use lasboundary to create labeled bounding boxes for display in Google Earth and lasgrid to create a false color visualization of pulse density with the command lines shown below. Pulse densities of 20 or below are mapped to blue. Pulse densities of 100 or above are mapped to red. We picked the min value 20 and the max value 100 for this false color mapping by running lasinfo with the ‘-cd’ option to compute an average pulse density and then refining the numbers experimentally. We also use lasoverlap to visualize how flightlines overlap and how well they align. Vertical differences of up to 20 cm are mapped to white and differences of 40 cm or more are mapped to saturated blue or red.

lasboundary -i *.las ^
            -use_bb ^
            -labels ^
            -odir quality -odix _bb -okml

lasgrid -i *.las ^
        -keep_last ^
        -point_density -step 2 ^
        -false -set_min_max 20 100 ^
        -odir quality -odix _d_20_100 -opng ^
        -cores 2

lasoverlap -i *.las ^
           -min_diff 0.2 -max_diff 0.4 ^
           -odir quality -opng ^
           -cores 2

The visualization of the pulse density and of the flightline overlap both show that there is no LiDAR for the White House or Capitol Hill. We will never know how tall the tomato and kale plants had grown in Michelle Obama’s organic garden on that day. Note that the White House and Capitol Hill were not simply “cut out”. Instead the flight plan of the survey plane was carefully designed to avoid these areas. Surprisingly, the Pentagon did not receive the same treatment and is (almost) fully included in the open LiDAR as mentioned in the dramatic first paragraph. Interesting is how the varying (tidal?) water level of the Potomac River shows up in the visualization of flightline misalignments.

There are a number of issues in these LiDAR files. The most serious ones are reported at the very end of this article. We will now scrutinize the partly-filled tile 2016.las close to the White House with only 11,060,334 returns. A lasvalidate check immediately reports three deviations from the LAS 1.4 specification:

lasvalidate -i 2016.las -o 2016_check.xml
  1. For proper LAS 1.4 files containing point types 6 through 10 all ‘legacy’ point counts in the LAS header should be set to 0. The following six fields in the LAS header should be zero for tile 2016.las (and all other tiles):
    + legacy number of point records
    + legacy number of points by return[0]
    + legacy number of points by return[1]
    + legacy number of points by return[2]
    + legacy number of points by return[3]
    + legacy number of points by return[4]
  2. There should not be any LiDAR return in a valid LAS file whose ‘number of returns of given pulse’ attribute is zero but there are 8 such points in tile 2016.las (and many more in various other tiles).
  3. There should not be any LiDAR return whose ‘return number’ attribute is larger than their ‘number of returns of given pulse’ attribute but there are 8 such points in tile 2016.las (and many more in various other tiles).

The first issue is trivial. There is an efficient in-place fix that does not require rewriting the entire file, using lasinfo with the following command line:

lasinfo -i 2016.las ^
        -nh -nv -nc ^
        -set_number_of_point_records 0 ^
        -set_number_of_points_by_return 0 0 0 0 0

A quick check with las2txt shows us that the second and third issue are caused by the same eight points. Instead of writing an 8 for the ‘number of returns’ attribute the LAS file exporter must have written a 0 (marked in red for all eight returns) and instead of writing an 8 for the ‘return number’ attribute the LAS file exporter must have written a 1 (also marked in red). We can tell the true last return apart from the true first return via its z coordinate (marked in blue), as the last return should be the lowest of all.

las2txt -i 2016.las ^
        -keep_number_of_returns 0 ^
        -parse xyzrnt ^
        -stdout
397372.70 136671.62 33.02 4 0 112813299.954811
397372.03 136671.64 28.50 5 0 112813299.954811
397371.28 136671.67 23.48 6 0 112813299.954811
397370.30 136671.68 16.86 7 0 112813299.954811
397369.65 136671.70 12.50 1 0 112813299.954811
397374.37 136671.58 44.17 3 0 112813299.954811
397375.46 136671.56 51.49 1 0 112813299.954811
397374.86 136671.57 47.45 2 0 112813299.954811

With las2las we can change the ‘number of returns’ from 0 to 8 using a ‘-filtered_transform’ as illustrated in the command line below. We suspect that higher numbers of returns such as 9 or 10 might have been mapped to 1 and 2. Fixing those as well as repairing the wrong return numbers will require a more complex tool. We would recommend checking all tiles with more scrutiny using the lasreturn tool. But wait … more return numbering issues are to come.

las2las -i 2016.las ^
        -keep_number_of_returns 0 ^
        -filtered_transform ^
        -set_extended_number_of_returns 8 ^
        -odix _fixed -olas

A closer look at the scan pattern reveals that the LiDAR survey was flown with a dual-beam system where two laser beams scan the terrain simultaneously. This is evident in the textual representation below as there are multiple “sets of returns” for the same GPS time stamp such as 112813952.110394. We group the returns from the two beams into an orange and a green group. Their coordinates show that the two laser beams point into different directions when they are simultaneously “shot” and therefore hit the terrain far apart from one another.

las2txt -i 2016.las ^
        -keep_gps_time 112813952.110392 112813952.110396 ^
        -parse xyzlurntp ^
        -stdout
397271.40 136832.35 54.31 0 0 1 1 112813952.110394 117
397277.36 136793.35 38.68 0 1 1 4 112813952.110394 117
397277.35 136793.56 32.89 0 1 2 4 112813952.110394 117
397277.34 136793.88 24.13 0 1 3 4 112813952.110394 117
397277.32 136794.25 13.66 0 1 4 4 112813952.110394 117

The information about which point is from which beam is currently stored into the generic ‘user data’ attribute instead of into the dedicated ‘scanner channel’ attribute. This can be fixed with las2las as follows.

las2las -i 2016.las ^
        -copy_user_data_into_scanner_channel ^
        -set_user_data 0 ^
        -odix _fixed -olas

Unfortunately the LiDAR files have much more serious issues in the return numbering. It’s literally a “Total Disaster!” and “Sad!” as the US president will tweet shortly. After grouping all returns with the same GPS time stamp into an orange and a green group there is one more set of returns left unaccounted for.

las2txt -i 2016.las ^
        -keep_gps_time 112813951.416451 112813951.416455 ^
        -parse xyzlurntpi ^
        -stdout
397286.02 136790.60 45.90 0 0 1 4 112813951.416453 117 24
397286.06 136791.05 39.54 0 0 2 4 112813951.416453 117 35
397286.10 136791.51 33.34 0 0 3 4 112813951.416453 117 24
397286.18 136792.41 21.11 0 0 4 4 112813951.416453 117 0
397286.12 136791.75 30.07 0 0 1 1 112813951.416453 117 47
397291.74 136750.70 45.86 0 1 1 1 112813951.416453 117 105

las2txt -i 2016.las ^
        -keep_gps_time 112813951.408708 112813951.408712 ^
        -parse xyzlurntpi ^
        -stdout
397286.01 136790.06 45.84 0 0 1 4 112813951.408710 117 7
397286.05 136790.51 39.56 0 0 2 4 112813951.408710 117 15
397286.08 136790.96 33.33 0 0 3 4 112813951.408710 117 19
397286.18 136792.16 17.05 0 0 4 4 112813951.408710 117 0
397286.11 136791.20 30.03 0 0 1 2 112813951.408710 117 58
397286.14 136791.67 23.81 0 0 2 2 112813951.408710 117 42
397291.73 136750.16 45.88 0 1 1 1 112813951.408710 117 142

This can be visualized with lasview and the result is unmistakably clear: The return numbering is messed up. There should be one shot with five returns (not a group of four and a single return) in the first example. And there should be one shot with six returns (not a group of four and a group of two returns) in the second example. Such broken return numbering results in extra first (or last) returns. These are serious issues that affect any algorithm that relies on the return numbering such as first-return DSM generation or canopy cover computation. Those extra returns will also make the pulse density appear higher and the pulse spacing appear tighter than they really are. The numbers from 20 (blue) to 100 (red) pulses per square meter in our earlier visualization are definitely inflated.

lasview -i 2016.las ^
        -keep_gps_time 112813951.416451 112813951.416455 ^
        -color_by_return

lasview -i 2016.las ^
        -keep_gps_time 112813951.408708 112813951.408712 ^
        -color_by_return

After all these troubles, here is something nice. Side-by-side a first-return TIN and a spike-free TIN (using a freeze of 0.8 m) of the center court cafe in the Pentagon. Especially given all these “fake first returns” in the Washington DC LiDAR we really need the spike-free algorithm to finally “Make a DSM great again!” … (-;
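
A sketch of how the two hillshades might be generated, assuming a las2dem recent enough to support the ‘-spike_free’ option; step size and output names are our choices:

las2dem -i pentagon.laz ^
        -keep_first ^
        -step 0.5 -hillshade ^
        -o pentagon_first_return.png

las2dem -i pentagon.laz ^
        -spike_free 0.8 ^
        -step 0.5 -hillshade ^
        -o pentagon_spike_free.png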

We would like to acknowledge the District of Columbia Office of the Chief Technology Officer (OCTO) for providing this data with a very permissive open data license, namely the Creative Commons Attribution 3.0 License.

 

NRW Open LiDAR: Merging Points into Proper LAS Files

In the first part of this series we downloaded, compressed, and viewed some of the newly released open LiDAR data for the state of North Rhine-Westphalia. In the second part we look at how to merge the multiple point clouds provided back into single LAS or LAZ files that are as proper as possible. Follow along with a recent version of LAStools and a pair of DGM and DOM files for your area of interest. For downloading the LiDAR we suggest using the wget command line tool with option ‘-c’ that after interruption in transmission will restart where it left off.

In the first part of this series we downloaded the pair of DGM and DOM files for the City of Bonn. The DGM file and the DOM file are zipped archives that contain the points in 1km by 1km tiles stored as x, y, z coordinates with centimeter resolution. We had already converted these textual *.xyz files into binary *.laz files. We did this with the open source LASzip compressor that is distributed with LAStools as described in that blog post. We continue now with the assumption that you have converted all of the *.xyz files to *.laz files as described here.

The mapping from tile names in DGM and DOM archives to the classification and return type of points: lp = last return, fp = first return, ab/aw/ag = synthetic points.

There are multiple tiles for each square kilometer as the LiDAR has been split into different files based on classification and return type. Furthermore there are also synthetic points that were created by the land survey department to replace LiDAR under bridges and along buildings for generating higher quality rasters. We want to combine all points of a square kilometer into a single LAZ tile as is usually expected. Simply merging the multiple files per tile is not an option as this would result in losing point classifications and return type information as well as in duplicating all single returns that are stored in more than one file. The folks at OpenNRW offer this helpful graphic to explain what the acronyms above mean:

Illustration of how acronyms used in tile names correspond to point classification and type.

In the following we’ll be looking at the set of files corresponding to the UTM tile 32366 / 5622. We wanted an interesting area with large buildings, a bridge, and water. We were looking for a suitable area using the KML overlays generated in part one. The tile along the Rhine river selected in the picture below covers the old city hall, the opera house, and the “Kennedy Bridge” and has a complete set of DGM and DOM files:

      3,501 dgm1l-ab_32366_5622_1_nw.laz
     16,061 dgm1l-ag_32366_5622_1_nw.laz
      3,269 dgm1l-aw_32366_5622_1_nw.laz
    497,008 dgm1l-brk_32366_5622_1_nw.laz
  7,667,715 dgm1l-lpb_32366_5622_1_nw.laz
 12,096,856 dgm1l-lpnb_32366_5622_1_nw.laz
     15,856 dgm1l-lpub_32366_5622_1_nw.laz

      3,269 dom1l-aw_32366_5622_1_nw.laz
 21,381,106 dom1l-fp_32366_5622_1_nw.laz

We find the name of the tile that covers the “Kennedy Bridge” using the KML overlays generated in part one.

We now assign classification codes and flags to the returns from the different files using las2las, merge them together with lasmerge, and recover single, first, and last return information with lasduplicate. With las2las we set the classification to bridge deck (17) for all returns in the files with the acronym ‘brk’ (= bridge points), to ground (2) for the acronym ‘lpb’ (= last return ground), to unclassified (1) for the acronym ‘lpnb’ (= last return non-ground), and to noise (7) for the acronym ‘lpub’ (= last return under ground), and store the resulting files to a temporary folder.

las2las -i dgm1l-brk_32366_5622_1_nw.laz ^
        -set_classification 17 ^
        -odir temp -olaz

las2las -i dgm1l-lpb_32366_5622_1_nw.laz ^
        -set_classification 2 ^
        -odir temp -olaz

las2las -i dgm1l-lpnb_32366_5622_1_nw.laz ^
        -set_classification 1 ^
        -odir temp -olaz

las2las -i dgm1l-lpub_32366_5622_1_nw.laz ^
        -set_classification 7 ^
        -odir temp -olaz

Next we use the synthetic flag of the LAS format specification to flag any additional points that were added (not measured) by the survey department to generate better raster products. We set classifications to ground (2) and the synthetic flag for all points of the files with the acronym ‘ab’ (= additional ground) and the acronym ‘ag’ (= additional building footprint). We set classifications to water (9) and the synthetic flag for all points of the files with the acronym ‘aw’ (= additional water bodies). Files with acronym ‘aw’ appear both in the DGM and DOM archive. Obviously we need to keep only one copy.

las2las -i dgm1l-ab_32366_5622_1_nw.laz ^
        -set_classification 2 ^
        -set_synthetic_flag 1 ^
        -odir temp -olaz

las2las -i dgm1l-ag_32366_5622_1_nw.laz ^
        -set_classification 2 ^
        -set_synthetic_flag 1 ^
        -odir temp -olaz

las2las -i dgm1l-aw_32366_5622_1_nw.laz ^
        -set_classification 9 ^
        -set_synthetic_flag 1 ^
        -odir temp -olaz

Using lasmerge we merge all returns from files with acronyms ‘brk’ (= bridge points), ‘lpb’ (= last return ground),  ‘lpnb’ (= last return non-ground), and ‘lpub’ (= last return under ground) into a single file that will then contain all of the (classified) last returns for this tile.

lasmerge -i temp\dgm1l-brk_32366_5622_1_nw.laz ^
         -i temp\dgm1l-lpb_32366_5622_1_nw.laz ^
         -i temp\dgm1l-lpnb_32366_5622_1_nw.laz ^
         -i temp\dgm1l-lpub_32366_5622_1_nw.laz ^
         -o temp\dgm1l-lp_32366_5622_1_nw.laz

Next we run lasduplicate three times to recover which points are single returns and which points are the first and the last return of a pair of points generated by the same laser shot. First we run lasduplicate with option ‘-unique_xyz’ to remove any xyz duplicates from the last return file. We also mark all surviving returns as the second of two returns. Similarly, we remove any xyz duplicates from the first return file and mark all survivors as the first of two returns. Finally we run lasduplicate with option ‘-single_returns’ with the unique last and the unique first return files as ‘-merged’ input. If a return with the exact same xyz coordinates appears in both files only the first copy is kept and marked as a single return. In order to keep the flags and classifications from the last return file, the order in which the last and first return files are listed in the command line is important.

lasduplicate -i temp\dgm1l-lp_32366_5622_1_nw.laz ^
             -set_return_number 2 -set_number_of_returns 2 ^
             -unique_xyz ^
             -o temp\last_32366_5622_1_nw.laz

lasduplicate -i dom1l-fp_32366_5622_1_nw.laz ^
             -set_return_number 1 -set_number_of_returns 2 ^
             -unique_xyz ^
             -o temp\first_32366_5622_1_nw.laz

lasduplicate -i temp\last_32366_5622_1_nw.laz ^
             -i temp\first_32366_5622_1_nw.laz ^
             -merged ^
             -single_returns ^
             -o temp\all_32366_5622_1_nw.laz

We then add the synthetic points with another call to lasmerge to obtain a LAZ file containing all points of the tile correctly classified, flagged, and return-numbered.

lasmerge -i temp\dgm1l-ab_32366_5622_1_nw.laz ^
         -i temp\dgm1l-ag_32366_5622_1_nw.laz ^
         -i temp\dgm1l-aw_32366_5622_1_nw.laz ^
         -i temp\all_32366_5622_1_nw.laz ^
         -o temp\merged_32366_5622_1_nw.laz

Optional: For more efficient use of this file in subsequent processing – and especially to accelerate area-of-interest queries with lasindex – it is often of great advantage to reorder the points in a spatially coherent manner. A simple call to lassort will rearrange the points along a space-filling curve such as a Hilbert curve or a Z-order curve.

lassort -i temp\merged_32366_5622_1_nw.laz ^
        -o bonn_32366_5622_1_nw.laz

Note that we also renamed the file because a good name can be useful if you find that file again two years from now. Let’s have a look at the result with lasview.

lasview -i bonn_32366_5622_1_nw.laz

In lasview you can press <c> repeatedly to switch through all available coloring modes until you see the yellow (single) / red (first) / blue (last) coloring of the returns. The recovered return types are especially evident under vegetation, in the presence of wires, and along building edges. Press <x> to select an area of interest and press <x> again to inspect it more closely. Press <i> while hovering above a point to show its coordinates, classification, and return type.

We did each processing in separate steps to illustrate the overall workflow. The above sequence of LAStools command line calls can be shortened by combining multiple processing steps into one operation. This is left as an exercise for the advanced LAStools user … (-;

Acknowledgement: The LiDAR data of OpenNRW comes with a very permissive license. It is called “Datenlizenz Deutschland – Namensnennung – Version 2.0” or “dl-de/by-2-0” and allows data and derivative sharing as well as commercial use. It only requires us to name the source. We need to cite the “Land NRW (2017)” with the year of the download in brackets and specify the Uniform Resource Identifier (URI) for both the DOM and the DGM. Done. So easy. Thank you, OpenNRW … (-:

NRW Open LiDAR: Download, Compression, Viewing

UPDATE: (March 6th): Second part merging Bonn into proper LAS files

This is the first part of a series on how to process the newly released open LiDAR data for the entire state of North Rhine-Westphalia that was announced a few days ago. Again, kudos to OpenNRW for being the most progressive open data state in Germany. You can follow this tutorial after downloading the latest version of LAStools as well as a pair of DGM and DOM files for your area of interest from these two download pages.

We have downloaded the pair of DGM and DOM files for the Federal City of Bonn. Bonn is the former capital of Germany and was host to the FOSS4G 2016 conference. As both files are larger than 10 GB, we use the wget command line tool with option ‘-c’ that will restart where it left off in case the transmission gets interrupted.

The DGM file and the DOM file are zipped archives that contain the points in 1km by 1km tiles stored as x, y, z coordinates in ETRS89 / UTM 32 projection as simple ASCII text with centimeter resolution (i.e. two decimal digits).

>> more dgm1l-lpb_32360_5613_1_nw.xyz
360000.00 5613026.69 164.35
360000.00 5613057.67 164.20
360000.00 5613097.19 164.22
360000.00 5613117.89 164.08
360000.00 5613145.35 164.03
[...]

There is more than one tile for each square kilometer as the LiDAR points have been split into different files based on their classification and their return type. Furthermore there are also synthetic points that were used by the land survey department to replace certain LiDAR points in order to generate higher quality DTM and DSM raster products.

The zipped DGM archive is 10.5 GB in size and contains 956 *.xyz files totaling 43.5 GB after decompression. The zipped DOM archive is 11.5 GB in size and contains 244 *.xyz files totaling 47.8 GB. Repeatedly loading these 90 GB of text data and parsing these human-readable x, y, and z coordinates is inefficient with common LiDAR software. In the first step we convert the textual *.xyz files into binary *.laz files that can be stored, read and copied more efficiently. We do this with the open source LASzip compressor that is distributed with LAStools using these two command line calls:

laszip -i dgm1l_05314000_Bonn_EPSG5555_XYZ\*.xyz ^
       -epsg 25832 -vertical_dhhn92 ^
       -olaz ^
       -cores 2

laszip -i dom1l_05314000_Bonn_EPSG5555_XYZ\*.xyz ^
       -epsg 25832 -vertical_dhhn92 ^
       -olaz ^
       -cores 2

The point coordinates are in EPSG 5555, which is a compound datum of horizontal EPSG 25832 aka ETRS89 / UTM zone 32N and vertical EPSG 5783 aka the “Deutsches Haupthoehennetz 1992” or DHHN92. We add this information to each *.laz file during the LASzip compression process with the command line options ‘-epsg 25832’ and ‘-vertical_dhhn92’.

LASzip reduces the file size by a factor of 10. The 956 *.laz DGM files compress down to 4.3 GB from 43.5 GB for the original *.xyz files and the 244 *.laz DOM files compress down to 4.8 GB from 47.8 GB. From here on out we continue to work with the 9 GB of slim *.laz files. But before we delete the 90 GB of bulky *.xyz files we make sure that there are no file corruptions (e.g. disk full, truncated files, interrupted processes, bit flips, …) in the *.laz files.

laszip -i dgm1l_05314000_Bonn_EPSG5555_XYZ\*.laz -check
laszip -i dom1l_05314000_Bonn_EPSG5555_XYZ\*.laz -check

One advantage of having the LiDAR in an industry standard such as the LAS format (or its lossless compressed twin, the LAZ format) is that the header of the file stores the number of points per file, the bounding box, as well as the projection information that we have added. This allows us to very quickly load an overview, for example, into lasview.

lasview -i dgm1l_05314000_Bonn_EPSG5555_XYZ\*.laz -gui

The bounding boxes of the DGM files quickly give us an overview in the GUI when the files are in LAS or LAZ format.

Now we want to find a particular site in Bonn such as the World Conference Center Bonn where FOSS4G 2016 was held. Which tile is it in? We need some geospatial context to find it, for example, by creating an overview in the form of KML files that we can load into Google Earth. We use the files from the DOM folder with “fp” in the name as points on buildings are mostly “first returns”. See what our previous blog post writes about the different file names or look at the second part of this series. We can create the KML files with lasboundary either via the GUI or in the command line.

lasboundary -i dom1l_05314000_Bonn_EPSG5555_XYZ\dom1l-fp*.laz ^
            -gui
Only the "fp" tiles from the DOM folder loaded the GUI into lasboundary.

Only the “fp” tiles from the DOM folder are loaded into the GUI of lasboundary.

lasboundary -i dom1l_05314000_Bonn_EPSG5555_XYZ\dom1l-fp*.laz ^
            -use_bb -labels -okml

We zoom in and find the World Conference Center Bonn and load the identified tile into lasview. Well, we did not expect this to happen, but what we see below will make this series of tutorials even more worthwhile. There is a lot of “high noise” in the particular tile we picked. We should have noticed the unusually high z range of 406.42 meters in the Google Earth pop-up. Is this high electromagnetic radiation interfering with the sensors? There are a number of high-tech government buildings with all kinds of antennas nearby (such as the United Nations Bonn Campus that the mouse cursor points at).
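
In hindsight, one simple way to catch such outliers before loading individual tiles is to generate a lasinfo report for every tile and scan the reported minimum and maximum z values for suspiciously large ranges. Here is a minimal sketch that uses only lasinfo options appearing elsewhere in this series (the ‘_info’ suffix for the report files is our choice):

lasinfo -i dom1l_05314000_Bonn_EPSG5555_XYZ\dom1l-fp*.laz ^
        -nh -nv ^
        -odix _info -otxt ^
        -cores 4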


Significant amounts of high noise are in the first returns of the DOM tile we picked.

But the intended area of interest was found. You can see the iconic “triangulated” roof of the building that is across from the World Conference Center Bonn.


The World Conference Center Bonn is across from the building with the “triangulated” roof.

Please don’t think it is the responsibility of OpenNRW to remove the noise and provide cleaner data. The land survey department has already processed this data into whatever products they needed and that is where their job ended. Any additional services – other than sharing the raw data – are not in their job description. We’ll take care of that … (-:

Acknowledgement: The LiDAR data of OpenNRW comes with a very permissive license. It is called “Datenlizenz Deutschland – Namensnennung – Version 2.0” or “dl-de/by-2-0” and allows data and derivative sharing as well as commercial use. It only requires us to name the source. We need to cite the “Land NRW (2017)” with the year of the download in brackets and specify the Uniform Resource Identifier (URI) for both the DOM and the DGM. Done. So easy. Thank you, OpenNRW … (-:

Creating a Better DTM from Photogrammetric Points by Avoiding Shadows

At INTERGEO 2016 in Hamburg, the guys from Aerowest GmbH shared with us a small photogrammetric point cloud from the city of Soest that had been generated with the SURE dense-matching software from nFrames. We want to test whether – using LAStools – we can generate a decent DTM from these points that are essentially a gridded DSM. If this interests you, also see this, this, this, and this story.


Here you can download the four original tiles (tile1, tile2, tile3, tile4) that we are using in these experiments. We first re-tile them into smaller 100 meter by 100 meter tiles with a 20 meter buffer using lastile. We use option ‘-flag_as_withheld’ that flags all the points falling into the buffer using the withheld flag so they can easily be removed on-the-fly later with the ‘-drop_withheld’ filter (see the README for more on this). We also add the missing projection with ‘-epsg 32632’.

lastile -i original\*.laz ^
        -tile_size 100 -buffer 20 ^
        -flag_as_withheld ^
        -epsg 32632 ^
        -odir tiles_raw -o soest.laz

Below is a screenshot from one of the resulting 100 meter by 100 meter tiles (with 20 meter buffer) that we will be focusing on in the following experiments.


The tile ‘soest_437900_5713800.laz’ used in all experiments.

Next we use lassort to reorder the points along a coherent space-filling curve, as the existing scan-line order has the potential to slow down our triangulation engine. We do this on 4 cores.

lassort -i tiles_raw\*.laz ^
        -odir tiles_sorted -olaz ^
        -cores 4

We then use lasthin to lower the number of points that we consider as ground points (see the README for more info on this tool). We do this because the original 5 cm spacing of the dense matching points is a bit excessive for generating a DTM with a resolution of, for example, 50 cm. Instead we only use the lowest point in each 20 cm by 20 cm cell as a candidate for becoming a ground point, which reduces the number of considered points by a factor of 16. We achieve this by classifying these lowest points with a unique classification code (here: 9) and later telling lasground to ignore all other classifications.

lasthin -i tiles_sorted\*.laz ^
        -step 0.2 -lowest -classify_as 9 ^
        -odir tiles_thinned -olaz ^
        -cores 4
Then we run lasground on 4 cores to classify the ground points with options ‘-step 20’, ‘-bulge 0.5’, ‘-spike 0.5’ and ‘-fine’ that we selected after two trials on a single tile. There are several other options in lasground to play with that may achieve better results on other data sets (see README file for more on this). The ‘-ignore_class 0’ option instructs lasground to ignore all points that are not classified so that only those points that lasthin was classifying as 9 in the previous step are considered.
lasground -i tiles_thinned\*.laz ^
          -step 20 -bulge 0.5 -spike 0.5 -fine ^
          -ignore_class 0 ^
          -odir tiles_ground -olaz ^
          -cores 4
However, when we scrutinize the resulting ground classification we notice that there are bumps in the corresponding ground TIN. These bumps seem to correspond to areas where the original imagery has dark shadows, which in turn seem to significantly affect the geometric accuracy of the points generated by the dense-matching algorithm.
Looking at the bump from below we identify the RGB colors of the points that form the bump, which seem to be of much lower accuracy than the rest. This is an effect that we have noticed in the past for data generated with other dense-matching software, and maybe an approach similar to the one we take here could also help with that low noise. It seems that points generated from shadowed areas in the input images can have much lower accuracy than those from well-lit areas. We use this correlation between RGB color and geometric accuracy to simply exclude all points whose RGB colors indicate that they might be from shadow areas from becoming ground points.

The RGB colors of low-accuracy points are mostly from very dark shadowed areas.

We use las2las with the relatively new ‘-filtered_transform’ option to reclassify all points whose RGB color is close to zero to classification code 7 (see README file for more on this). We do this for all points whose red value is between 0 and 30, whose green value is between 0 and 35, and whose blue value is between 0 and 40. Because the RGB values are stored with 16 bits in these files we have to multiply these thresholds by 256 when applying the filter (30 × 256 = 7680, 35 × 256 = 8960, 40 × 256 = 10240).
las2las -i tiles_thinned\*.laz ^
        -keep_RGB_red 0 7680 ^
        -keep_RGB_green 0 8960 ^
        -keep_RGB_blue 0 10240 ^
        -filtered_transform ^
        -set_classification 7 ^
        -odir tiles_rgb_filtered -olaz ^
        -cores 4
Below you see all points that will no longer be considered because their classification was set to 7 by the command above.

Points whose RGB values indicate they might lie in the shadows.

We then re-run lasground with the same options ‘-step 20’, ‘-bulge 0.5’, ‘-spike 0.5’ and ‘-fine’ as before, but now we ignore all points that still have classification 0 (because they were not classified as 9 by lasthin earlier) as well as all points that were assigned classification 7 by las2las in the previous step.
lasground -i tiles_thinned\*.laz ^
          -step 20 -bulge 0.5 -spike 0.5 -fine ^
          -ignore_class 0 7 ^
          -odir tiles_ground -olaz ^
          -cores 4
The situation has significantly improved for the bump we saw before.
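
To eyeball the improvement we can, for example, load the ground points of our focus tile into lasview and press <t> to triangulate them into a TIN (a minimal sketch; the tile name is the one shown earlier):

lasview -i tiles_ground\soest_437900_5713800.laz ^
        -keep_class 2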

Finally we create a DTM with blast2dem (see README) and a DSM with lasgrid (see README) by merging all points into one file but dropping the buffer points that were marked as withheld by the initial run of lastile (see README).

blast2dem -i tiles_ground\*.laz -merged ^
          -drop_withheld -keep_class 2 ^
          -step 0.5 ^
          -o dtm.bil

lasgrid -i tiles_ground\*.laz -merged ^
        -drop_withheld ^
        -step 0.5 -average ^
        -o dsm.bil
Our venerable lasview (see README) can directly read BIL rasters as points (just like all the other LAStools), so we can compare the DTM and the DSM by loading them as two flight lines (‘-faf’) and then switching back and forth between the two by pressing ‘0’ and ‘1’ in the viewer.
lasview -i dtm.bil dsm.bil -faf

Above you see the final DTM and the original DSM. So yes, LAStools can definitely create a DTM from point clouds that are the result of dense-matching photogrammetry. We used the correlation between shadowed areas in the source imagery and geometric errors to remove from consideration as ground points those points that are likely to be too low and would result in bumps. For dense-matching algorithms that also output an uncertainty value for each point there is potential for even better results, as our range of eliminated RGB colors may not cover all geometrically uncertain points while at the same time removing some correct ground points.
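
A hypothetical sketch of such an attribute-based filter: assuming the per-point uncertainty were stored as extra-bytes attribute 0 (in meters), that 0.05 were a sensible threshold, and that your LAStools version supports the extra-bytes attribute filters, the RGB filter step could be replaced by something like this:

rem hypothetical: verify the attribute index and units with lasinfo first
las2las -i tiles_thinned\*.laz ^
        -keep_attribute_above 0 0.05 ^
        -filtered_transform ^
        -set_classification 7 ^
        -odir tiles_uncertainty_filtered -olaz ^
        -cores 4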

Fixing Intensity Differences between Flightlines (“quick and dirty”)

Visiting our users on-site, such as last week at Mariano Marcos State University in Ilocos Norte in the Philippines, we sometimes come across situations as pictured below where the intensity values of the returns of one flightline are drastically different from that of other flightlines.


The intensity of returns in the left most flightline is different from that of other flightlines.

Using intensity rasters with such dark strips as an additional input for land cover classification is likely to make things worse. Radiometrically “correct” intensity calibration is a complex topic and may not always be possible using only the LAZ files, as it can require meta information such as the internals of the scanning system and the aircraft trajectory. However, we now describe a “quick and dirty” fix for the situation shown above so that the intensity grids (that were computed as averages of first return intensities) at least “look” as sensible as the one square tile (shown below) that was corrected by simply multiplying all intensities of the dark strip by 5.


Simply multiplying all intensities of the dark flightline with 5 seems to “fix” the issue for our test tile.
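
For reference, average first-return intensity rasters like the ones shown here can be generated with lasgrid. This is only a sketch under assumptions: the input ‘tiles/*.laz’ is a stand-in for your tiling and the 0 to 60 gray ramp simply matches the corrected run shown further below.

lasgrid -i tiles/*.laz ^
        -first_only ^
        -intensity -average ^
        -gray -set_min_max 0 60 ^
        -odir tiles_int_rasters_before -opng ^
        -cores 8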

The number 5 was determined by a quick glance at the intensity histograms that we can generate with lasinfo. We decide to only look at single returns as we expect them to correlate better across flightlines: their locations are more likely to be “seen similarly” from, and their energy “reflected similarly” to, different flightlines than those of multiple returns.

lasinfo -i strip1.laz strip2.laz strip3.laz ^
        -keep_single ^
        -histo intensity 1 ^
        -nmm -nh -nv ^
        -odix _histo_int -otxt

The resulting histogram for the dark ‘strip1.laz’ is quite different from those of the much brighter ‘strip2.laz’ and ‘strip3.laz’. The average single return intensity for the dark ‘strip1.laz’ is a meager 5.13 whereas the brighter ‘strip2.laz’ and ‘strip3.laz’ have similar averages of 24.15 and 24.50 respectively.

Draw your own conclusions about which scale factor to use. We have the choice to match either the peaks of the histograms or their averages. Scaling the peak of 3 for ‘strip1.laz’ to match the 25 of the other two strips would be too much of an upscaling. But the average 24.15 divided by 5.13 gives a potential scale of 4.71, the average 24.50 divided by 5.13 gives a potential scale of 4.78, and we already saw a multiplication by 5 giving reasonable results. So this is how we can fix the intensity:

las2las -i strip1.laz ^
        -scale_intensity 4.75 ^
        -odix _corr_int -olaz

But what if your data is already in tiles? How can you adjust the intensity of only those returns that are from flightline 1? Assuming that the flightline information is properly stored in the point source ID field of every point, this is easily done with the new ‘-filtered_transform’ of las2las, running on as many cores as you have:

las2las -i tiles/*.laz ^
        -keep_point_source 1 ^
        -filtered_transform ^
        -scale_intensity 4.75 ^
        -odir tiles_corr -olaz ^
        -cores 8

This is not currently exposed in the GUI of las2las but you can simply add it by typing it into the ‘RUN’ pop-up window as shown below.


Scaling only the intensities of flightline 1 by 4.9 using the new ‘-filtered_transform’.

After this “quick and dirty” intensity correction we again ran lasgrid as follows:

lasgrid -i tiles_corr/*.laz ^
        -first_only ^
        -intensity -average ^
        -gray -set_min_max 0 60 ^
        -odir tiles_int_rasters -opng ^
        -cores 8

And the result is shown below. The obvious flightline-induced discontinuity in the intensities has pretty much disappeared. Do you have similar flightline-related intensity issues? We’d like to hear from you whether this technique works or if we need to implement something more clever in the future …


Generating Spike-Free Digital Surface Models from LiDAR

A Digital Surface Model (DSM) represents the elevation of the landscape including all vegetation and man-made objects. An easy way to generate a DSM raster from LiDAR is to use the highest elevation value from all points falling into each grid cell. However, this “binning” approach only works when the resolution of the LiDAR is higher than the resolution of the raster. Only then do sufficiently many LiDAR points fall into each raster cell to prevent “empty pixels” and “data pits” from forming. For example, given LiDAR with an average pulse spacing of 0.5 meters one can easily generate a 2.5 meter DSM raster with simple “binning”. But to generate a 0.5 meter DSM raster we need to use an “interpolation” method.
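
For the 2.5 meter example this simple “binning” can be done, for instance, with lasgrid (a minimal sketch; ‘lidar.laz’ is a stand-in for your input file):

lasgrid -i lidar.laz ^
        -step 2.5 ^
        -elevation -highest ^
        -o dsm_binned.bil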


Laser pulses and discrete returns of four flightlines on two trees.

For the past twenty or so years, GIS textbooks and LiDAR tutorials have recommended using only the first returns to construct the interpolating surface for DSM generation. The intuition is that the first return is the highest return for an airborne survey where the laser beams come (more or less) from above. Hence, an interpolating surface of all first returns is constructed – usually based on a 2D Delaunay triangulation – and the resulting Triangular Irregular Network (TIN) is rasterized onto a grid at a user-specified resolution to create the DSM raster. A Canopy Height Model (CHM) is generated the same way, except that elevations are height-normalized either before or after the rasterization step. However, using a first-return interpolation for DSM/CHM generation has two critical drawbacks:

(1) Using only first returns means not all LiDAR information is used and some detail is missing. This is particularly the case for off-nadir scan angles in traditional airborne surveys. It becomes more pronounced with new scanning systems such as UAV or hand-held LiDAR where laser beams no longer come “from above”. Furthermore, in the event of clouds or high noise the first returns are often removed and the remaining returns are not renumbered. Hence, any laser shot whose first return reflects from a cloud or a bird does not contribute its highest landscape hit to the DSM or CHM.

(2) Using all first returns practically guarantees the formation of needle-shaped triangles in vegetated areas and along building roofs that appear as spikes in the TIN. This is because at off-nadir scan angles first returns are often generated far below other first returns as shown in the illustration above. The resulting spikes turn into “data pits” in the corresponding raster that not only look ugly but impact the utility of the DSM or CHM in subsequent analysis, for example, in forestry applications when attempting to extract individual trees.

In the following we present results and command-line examples for the new “spike-free” algorithm of Khosravipour et al. (2015, 2016) that is implemented (as a slow prototype) in the current LAStools release. This completely novel method for DSM generation triangulates all relevant LiDAR returns using a Constrained Delaunay algorithm. This constructs a “spike-free” TIN that is in turn rasterized into a “pit-free” DSM or CHM. This work is both a generalization and an improvement of our previous result on pit-free CHM generation.

We now compare our “spike-free” DSM to a “first-return” DSM on the two small urban data sets “france.laz” and “zurich.laz” distributed with LAStools. Using lasinfo with options ‘-last_only’ and ‘-cd’ we determine that the average pulse spacing is around 0.33 meter for “france.laz” and 0.15 meter for “zurich.laz”. We decide to create a hillshaded 0.25 meter DSM for “france.laz” and a 0.15 meter DSM for “zurich.laz” with the command-lines shown below.
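
For completeness, the pulse spacing check itself looks like this, using exactly the two lasinfo options named above:

lasinfo -i ..\data\france.laz ^
        -last_only -cd
lasinfo -i ..\data\zurich.laz ^
        -last_only -cd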

las2dem -i ..\data\france.laz ^
        -keep_first ^
        -step 0.25 ^
        -hillshade ^
        -o france_fr.png
las2dem -i ..\data\france.laz ^
        -spike_free 0.9 ^
        -step 0.25 ^
        -hillshade ^
        -o france_sf.png
las2dem -i ..\data\zurich.laz ^
        -keep_first ^
        -step 0.15 ^
        -hillshade ^
        -o zurich_fr.png
las2dem -i ..\data\zurich.laz ^
        -spike_free 0.5 ^
        -step 0.15 ^
        -hillshade ^
        -o zurich_sf.png

The differences between a first-return DSM and a spike-free DSM are most drastic along building roofs and in vegetated areas. To inspect the differences between a first-return and our spike-free TIN in more detail we use lasview, which allows us to interactively visualize the construction process of a spike-free TIN.

lasview -i ..\data\france.laz -spike_free 0.9

Pressing <f> and <t> constructs the first-return TIN. Pressing <SHIFT> + <t> destroys the first-return TIN. Pressing <SHIFT> + <y> constructs the spike-free TIN. Pressing <y> once destroys the spike-free TIN. Pressing <y> many times iteratively constructs the spike-free TIN.

One crucial piece of information is still missing: what value should you use as the freeze constraint of the spike-free algorithm, which we set to 0.9 for “france.laz” and to 0.5 for “zurich.laz” as the argument to the command-line option ‘-spike_free’? The optimal value is related to the expected edge length, and we found the 99th percentile of a histogram of the edge lengths of the last-return TIN to be useful. Or simpler … try a value that is about three times the average pulse spacing (here 3 × 0.33 ≈ 1.0 for “france.laz” and 3 × 0.15 ≈ 0.45 for “zurich.laz”, close to the values we used).

References:
Khosravipour, A., Skidmore, A.K., Isenburg, M. and Wang, T.J. (2015) Development of an algorithm to generate pit-free Digital Surface Models from LiDAR, Proceedings of SilviLaser 2015, pp. 247-249, September 2015.
Khosravipour, A., Skidmore, A.K., Isenburg, M. (2016) Generating spike-free Digital Surface Models using raw LiDAR point clouds: a new approach for forestry applications, (journal manuscript under review).

Create proper LAS 1.4 files with LAStools (for free)

So your LiDAR processing workflow produces beautiful LAS or LAZ output. You think the files are nothing short of perfect. But they are in LAS 1.2 format and the tender document explicitly requests delivery in the latest LAS 1.4 format. The free and open source las2las module of LAStools – also known as the Swiss Army knife of LiDAR processing – can easily up-convert them.

There are two possible scenarios: (1) The client wants the data as LAS 1.4 files but does not care about which point type is used. (2) The client wants LAS 1.4 and specifically asks for one of the new point types such as point type 6.

Let us assume you have LAS 1.2 content with point type 1 like these LAZ files from Martin County in Minnesota. Use these four tiles 5126-05-42.laz, 5126-05-43.laz, 5126-06-42.laz, and 5126-06-43.laz to repeat the LAS 1.4 up-conversion we are doing in the following. You will need to use version 160110 of LAStools (or newer) that you can download here.

A report generated with lasinfo shows geo-referencing information stored as GeoTIFF keys in the first VLR. Then there are 10 useless numbers stored in the second VLR that seem left-over from an earlier version of geo-referencing without EPSG codes. The two vendor-specific ‘NIIRS10’ VLRs should probably be removed unless they are meaningful to you. The LAS header is followed by 1564 user-defined bytes (sometimes used as “padding”) that we can also delete before delivery.

D:\LAStools\bin>lasinfo -i martin\5126-05-42.laz
reporting all LAS header entries:
 file signature: 'LASF'
 file source ID: 0
 global_encoding: 0
 project ID GUID data 1-4: E603549E-B74A-4696-3BAD-EA49F643C221
 version major.minor: 1.2
 system identifier: 'LAStools (c) by Martin Isenburg'
 generating software: 'lassort (110815) unlicensed'
 file creation day/year: 119/2011
 header size: 227
 offset to point data: 2163
 number var. length records: 4
 point data format: 1
 point data record length: 28
 number of point records: 13772409
 number of points by return: 13440460 240598 78743 12608 0
 scale factor x y z: 0.01 0.01 0.01
 offset x y z: 0 0 0
 min x y z: 361818.33 4855898.31 349.89
 max x y z: 364400.98 4859420.79 404.22
variable length header record 1 of 4:
 reserved 43707
 user ID 'LASF_Projection'
 record ID 34735
 length after header 40
 description 'by LAStools of Martin Isenburg'
 GeoKeyDirectoryTag version 1.1.0 number of keys 4
 key 1024 tiff_tag_location 0 count 1 value_offset 1 - GTModelTypeGeoKey: ModelTypeProjected
 key 3072 tiff_tag_location 0 count 1 value_offset 26915 - ProjectedCSTypeGeoKey: NAD83 / UTM 15N
 key 3076 tiff_tag_location 0 count 1 value_offset 9001 - ProjLinearUnitsGeoKey: Linear_Meter
 key 4099 tiff_tag_location 0 count 1 value_offset 9001 - VerticalUnitsGeoKey: Linear_Meter
variable length header record 2 of 4:
 reserved 43707
 user ID 'LASF_Projection'
 record ID 34736
 length after header 80
 description 'GeoTiff double parameters'
 GeoDoubleParamsTag (number of doubles 10)
 500000 0 -93 0.9996 0 1 6.37814e+006 298.257 0 0.0174533
variable length header record 3 of 4:
 reserved 43707
 user ID 'NIIRS10'
 record ID 1
 length after header 26
 description 'NIIRS10 Tile Index'
variable length header record 4 of 4:
 reserved 43707
 user ID 'NIIRS10'
 record ID 4
 length after header 10
 description 'NIIRS10 Timestamp'
the header is followed by 1564 user-defined bytes
LASzip compression (version 2.1r0 c2 50000): POINT10 2 GPSTIME11 2
reporting minimum and maximum for all LAS point record entries ...
 X 36181833 36440098
 Y 485589831 485942079
 Z 34989 40422
 intensity 1 4780
 return_number 1 4
 number_of_returns 1 4
 edge_of_flight_line 0 0
 scan_direction_flag 0 1
 classification 1 10
 scan_angle_rank -24 24
 user_data 0 32
 point_source_ID 5401 8015
 gps_time 243852.663596 403514.323142
number of first returns: 13440460
number of intermediate returns: 91408
number of last returns: 13440349
number of single returns: 13199808
overview over number of returns of given pulse: 13199808 323647 198449 50505 0 0 0
histogram of classification of points:
 6560507 unclassified (1)
 6744481 ground (2)
 401464 medium vegetation (4)
 55433 building (6)
 100 noise (7)
 10157 water (9)
 267 rail (10)

(1) Create LAS 1.4 files with point type 1

The las2las command shown below turns a folder of LAS 1.2 files into a folder of LAS 1.4 files with option ‘-set_version 1.4’ without changing the point type. It removes the last three VLRs with option ‘-remove_vlrs_from_to 1 3’ and the user-defined bytes in the LAS header with option ‘-remove_padding’ (*). The result is written as LAZ with option ‘-olaz’ and the task is run on up to 4 CPUs in parallel with option ‘-cores 4’.

(*) option ‘-remove_padding’ is called ‘-remove_extra’ in older versions of LAStools.

las2las -i martin/*.laz ^
        -remove_vlrs_from_to 1 3 ^
        -remove_padding ^
        -set_version 1.4 ^
        -odir martin_LAS14_pt1 -olaz ^
        -cores 4

(2) Create LAS 1.4 files with point type 6

The las2las command shown below turns a folder of LAS 1.2 files into a folder of LAS 1.4 files with option ‘-set_version 1.4’ and changes the point type from 1 to 6 with option ‘-set_point_type 6’. As before, some VLRs and the header padding are removed. It also adds a second description of the georeferencing information by rewriting the existing information stored as GeoTIFF keys into a properly formatted OGC WKT string with option ‘-set_ogc_wkt’. Thanks to ESRI’s assault on the LAZ format the output still needs to be in LAS. It can be compressed with laszip in a subsequent step using the ‘LAS 1.4 compatibility mode‘ sponsored by NOAA.

las2las -i martin/*.laz ^
        -remove_vlrs_from_to 1 3 ^
        -remove_padding ^
        -set_version 1.4 ^
        -set_point_type 6 ^
        -set_ogc_wkt ^
        -odir martin_LAS14_pt6 -olas ^
        -cores 4

laszip -i martin_LAS14_pt6/*.las -cores 4

Below is the output of lasinfo for the “upgraded” LAS 1.4 file with the OGC WKT string. Some of the key changes have been marked in red.

D:\LAStools\bin>lasinfo -i martin_LAS14_pt6\5126-05-42.las
reporting all LAS header entries:
 file signature: 'LASF'
 file source ID: 0
 global_encoding: 16
 project ID GUID data 1-4: E603549E-B74A-4696-3BAD-EA49F643C221
 version major.minor: 1.4
 system identifier: 'LAStools (c) by rapidlasso GmbH'
 generating software: 'las2las (version 160110)'
 file creation day/year: 119/2011
 header size: 375
 offset to point data: 1135
 number var. length records: 2
 point data format: 6
 point data record length: 30
 number of point records: 0
 number of points by return: 0 0 0 0 0
 scale factor x y z: 0.01 0.01 0.01
 offset x y z: 0 0 0
 min x y z: 361818.33 4855898.31 349.89
 max x y z: 364400.98 4859420.79 404.22
 start of waveform data packet record: 0
 start of first extended variable length record: 0
 number of extended_variable length records: 0
 extended number of point records: 13772409
 extended number of points by return: 13440460 240598 78743 12608 0 0 0 0 0 0 0 0 0 0 0
variable length header record 1 of 2:
 reserved 43707
 user ID 'LASF_Projection'
 record ID 34735
 length after header 40
 description 'by LAStools of Martin Isenburg'
 GeoKeyDirectoryTag version 1.1.0 number of keys 4
 key 1024 tiff_tag_location 0 count 1 value_offset 1 - GTModelTypeGeoKey: ModelTypeProjected
 key 3072 tiff_tag_location 0 count 1 value_offset 26915 - ProjectedCSTypeGeoKey: NAD83 / UTM 15N
 key 3076 tiff_tag_location 0 count 1 value_offset 9001 - ProjLinearUnitsGeoKey: Linear_Meter
 key 4099 tiff_tag_location 0 count 1 value_offset 9001 - VerticalUnitsGeoKey: Linear_Meter
variable length header record 2 of 2:
 reserved 43707
 user ID 'LASF_Projection'
 record ID 2112
 length after header 612
 description 'by LAStools of rapidlasso GmbH'
 WKT OGC COORDINATE SYSTEM:
 PROJCS["NAD83 / UTM 15N",GEOGCS["NAD83",DATUM["North_American_Datum_1983",SPHEROID["GRS 1980",6378137,298.257222101,AUTHORITY["EPSG","7019"]],AUTHORITY["EPSG","6269"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.01745329251994328,AUTHORITY["EPSG","9122"]],AUTHORITY["EPSG","4269"]],PROJECTION["Transverse_Mercator"],PARAMETER["latitude_of_origin",0],PARAMETER["central_meridian",-93],PARAMETER["scale_factor",0.9996],PARAMETER["false_easting",500000],PARAMETER["false_northing",0],UNIT["metre",1,AUTHORITY["EPSG","9001"]],AXIS["Easting",EAST],AXIS["Northing",NORTH],AUTHORITY["EPSG","26915"]]
reporting minimum and maximum for all LAS point record entries ...
 X 36181833 36440098
 Y 485589831 485942079
 Z 34989 40422
 intensity 1 4780
 return_number 1 4
 number_of_returns 1 4
 edge_of_flight_line 0 0
 scan_direction_flag 0 1
 classification 1 10
 scan_angle_rank -24 24
 user_data 0 32
 point_source_ID 5401 8015
 gps_time 243852.663596 403514.323142
 extended_return_number 1 4
 extended_number_of_returns 1 4
 extended_classification 1 10
 extended_scan_angle -4000 4000
 extended_scanner_channel 0 0
number of first returns: 13440460
number of intermediate returns: 91408
number of last returns: 13440349
number of single returns: 13199808
overview over extended number of returns of given pulse: 13199808 323647 198449 50505 0 0 0 0 0 0 0 0 0 0 0
histogram of classification of points:
 6560507 unclassified (1)
 6744481 ground (2)
 401464 medium vegetation (4)
 55433 building (6)
 100 noise (7)
 10157 water (9)
 267 rail (10)