Completeness and Correctness of Discrete LiDAR Returns per Laser Pulse fired

Again and again we have preached about the importance of quality checking when you first get your expensive LiDAR data from the vendor or your free LiDAR data from an open data portal. The minimal quality check we usually advocate consists of lasinfo, lasvalidate, lasoverlap, and lasgrid. The information computed by these LAStools can reassure you that the data contains the right information, is specification conform, has properly aligned flight lines, and has the density distribution you expect. For deliveries or downloads in LAZ format we in addition recommend running laszip with the option ‘-check’ to find the rare file that might have gotten bit-corrupted or truncated during the transfer or the download. Today we learn about a more advanced quality check that can be done by running lassort followed by lasreturn.

For every laser shot fired there are usually between one to five discrete LiDAR returns and some full-waveform systems may even deliver up to fifteen returns. Each of these one to fifteen returns is then given the exact same GPS time stamp that corresponds to the moment in time the laser pulse was fired. By having these unique GPS time stamps we can always recover the set of returns that come from the same laser shot. This makes it possible to check completeness (are all the returns in the file) and correctness (is the returns numbering correct) for the discrete returns of each laser pulse.

optech_galaxy_issue

Showing all sets of returns in the file that do not have an unique GPS time stamp because the set has one or more duplicate returns (e.g. two first returns, two second returns, … ).

With LAStools we can do this by running lassort followed by lasreturn for any LiDAR that comes from a single beam system. For LiDAR that comes from some multi-beam system, such as the Velodyne 16, 32, 64, or 128, the Optech Pegasus, the RIEGL LMS 1560 (aka “crossfire”), or the Leica ALS70 or ALS80 we first need to seperate the files into one file per beam, which can be done with lassplit.  In the following we investigate data coming from an Optech Galaxy single-beam system. First we sort the returns by GPS time stamp using lassort (this step can be omitted if the data is already sorted in acquisition order (aka by increasing GPS time stamps)) and then we check the return numbering with lasreturn:

lassort -i L001-1-M01-S1-C1_r.laz -gps_time -odix _sorted -olaz

lasreturn -i L001-1-M01-S1-C1_r_sorted.laz -check_return_numbering
checked returns of 11809046 multi and 8585573 single return pulses. took 26.278 secs
missing: 0 duplicate: 560717 too large: 0 zero: 0
duplicate
========
200543 returns with n = 1 and r = 1 are duplicate
80548 returns with n = 2 and r = 1 are duplicate
80548 returns with n = 2 and r = 2 are duplicate
41962 returns with n = 3 and r = 1 are duplicate
41962 returns with n = 3 and r = 2 are duplicate
41962 returns with n = 3 and r = 3 are duplicate
13753 returns with n = 4 and r = 1 are duplicate
13753 returns with n = 4 and r = 2 are duplicate
13753 returns with n = 4 and r = 3 are duplicate
13753 returns with n = 4 and r = 4 are duplicate
3636 returns with n = 5 and r = 1 are duplicate
3636 returns with n = 5 and r = 2 are duplicate
3636 returns with n = 5 and r = 3 are duplicate
3636 returns with n = 5 and r = 4 are duplicate
3636 returns with n = 5 and r = 5 are duplicate
WARNING: there are 59462 GPS time stamps that have returns with different number of returns

The output we see above indicates a problem in the return numbering. A recently added new options to lasreturn that allow to reclassify those returns that seem to be part of a problematic set of returns that either contains missing returns, duplicate returns, or returns with different values for the “numbers of returns of given pulse” attribute. This allows us to visualize the issue with lasview. All returns whose are part of a problematic set is shown in the image above.

lasreturn -i L001-1-M01-S1-C1_r_sorted.laz ^
          -check_return_numbering ^
          -classify_as 8 ^
          -classify_duplicate_as 9 ^
          -classify_violation_as 7 ^
          -odix _marked -olaz

This command will mark all sets of returns (i.e. returns that have the exact same GPS time stamp) that have missing returns as 8, that have duplicate returns as 9, and that have returns which different “number of returns per pulse” attribute as 7. The data we have here has no missing returns (no returns are classified as 8) but we have duplicate (9) and violating (7) returns. We look at them closely in single scan lines to conclude.

It immediately becomes obvious that the same GPS time stamp was assigned to the returns of pair of subsequent shots. If the subsequent shots have the same number of returns per shot they are classified as duplicate (9 or blue). If the subsequent shots have different number of returns per shot they are marked as violating (7 or violett) but the reason for the issue is the same. We can look at a few of these return sets in ASCII. Here two subsequent four return shots that have the same GPS time stamp.

237881.011730 4 1 691602.736 5878246.425 141.992 6 79
237881.011730 4 2 691602.822 5878246.415 141.173 6 89
237881.011730 4 3 691603.051 5878246.389 138.993 6 44
237881.011730 4 4 691603.350 5878246.356 136.150 6 169
237881.011730 4 1 691602.793 5878246.439 142.037 6 114
237881.011730 4 2 691602.883 5878246.429 141.185 6 96
237881.011730 4 3 691603.109 5878246.404 139.033 6 50
237881.011730 4 4 691603.414 5878246.370 136.129 6 137

Here a four return shot followed by a three return shot that have the same GPS time stamp.

237881.047753 4 1 691603.387 5878244.501 140.187 6 50
237881.047753 4 2 691603.602 5878244.476 138.141 6 114
237881.047753 4 3 691603.776 5878244.456 136.490 6 60
237881.047753 4 4 691603.957 5878244.436 134.767 6 116
237881.047753 3 1 691603.676 5878244.492 138.132 6 97
237881.047753 3 2 691603.845 5878244.473 136.534 6 90
237881.047753 3 3 691604.034 5878244.452 134.739 6 99

It appears the GPS time counter in the LMS export software did not store the GPS time with sufficient resolution to always distinguish subsequent shots. The issue was confirmed by Optech and was already fixed a few months ago.

We should point out that these double-used GPS time stamps have zero impact on the geometric quality of the point cloud or the distribution of returns. The drawback is that not all returns can easily be grouped into one unique set per laser shot and that the files are not entirely specification conform. Any software that relies on accurate and unique GPS time stamps (such as flight line alignment software) may potentially struggle as well. The bug of the twice-used GPS time stamps was a discovery that is probably of such low consequence that no user of Optech Galaxy data had noticed it in the 4 years that Galaxy had been sold … until we really really scrutinized some data from one of our clients. Optech reports that the issue has been fixed now. But there are other vendors out there with even more serious GPS time and return numbering issues … to be continued.

LASmoons: Gudrun Norstedt

Gudrun Norstedt (recipient of three LASmoons)
Forest History, Department of Forest Ecology and Management
Swedish University of Agricultural Sciences, Umeå, Sweden

Background:
Until the end of the 17th century, the vast boreal forests of the interior of northern Sweden were exclusively populated by the indigenous Sami. When settlers of Swedish and Finnish ethnicity started to move into the area, colonization was fast. Although there is still a prospering reindeer herding Sami culture in northern Sweden, the old Sami culture that dominated the boreal forest for centuries or even millenia is to a large extent forgotten.
Since each forest Sami family formerly had a number of seasonal settlements, the density of settlements must have been high. However, only very few remains are known today. In the field, old Sami settlements can be recognized through the presence of for example stone hearths, storage caches, pits for roasting pine bark, foundations of certain types of huts, reindeer pens, and fences. Researchers of the Forest History section of the Department of Forest Ecology and Management have long been surveying such remains on foot. This, however, is extremely time consuming and can only be done in limited areas. Also, the use of aerial photographs is usually difficult due to dense vegetation. Data from airborne laser scanning should be the best way to find remains of the old forest Sami culture. Previous research has shown the possibilities of using airborne laser scanning data for detecting cultural remains in the boreal forest (Jansson et al., 2009; Koivisto & Laulamaa, 2012; Risbøl et al., 2013), but no studies have aimed at detecting remains of the forest Sami culture. I want to test the possibilities of ALS in this respect.

DTM from the Krycklan catchment, showing a row of hunting pits and (larger) a tar pit.

Goal:
The goal of my study is to test the potential of using LiDAR data for detecting cultural and archaeological remains on the ground in a forest area where Sami have been known to dwell during historical times. Since the whole of Sweden is currently being scanned by the National Land Survey, this data will be included. However, the average point density of the national data is only 0,5–1 pulses/m^2. Therefore, the study will be done in an established research area, the Krycklan catchment, where a denser scanning was performed in 2015. The Krycklan data set lacks ground point classification, so I will have to perform such a classification before I can proceed to the creation of a DTM. Having tested various kind of software, I have found that LAStools seems to be the most efficient way to do the job. This, in turn, has made me aware of the importance of choosing the right methods and parameters for doing a classification that is suitable for archaeological purposes.

Data:
The data was acquired with a multi-spectral airborne LiDAR sensor, the Optech Titan, and a Micro IRS IMU, operated on an aircraft flying at a height of about 1000 m and positioning was post-processed with the TerraPos software for higher accuracy.
The average pulse density is 20 pulse/m^2.
+ About 7 000 hectares were covered by the scanning. The data is stored in 489 tiles.

LAStools processing:
1) run a series of classifications of a few selected tiles with both lasground and lasground_new with various parameters [lasground and lasground_new]
2) test the outcomes by comparing it to known terrain to find out the optimal parameters for classifying this particular LiDAR point cloud for archaeological purposes.
3) extract the bare-earth of all tiles (using buffers!!!) with the best parameters [lasground or lasground_new]
4) create bare-earth terrain rasters (DTMs) and analyze the area [lasdem]
5) reclassify the airborne LiDAR data collected by the National Land Survey using various parameters to see whether it can become more suitable for revealing Sami cultural remains in a boreal forest landscape  [lasground or lasground_new]

References:
Jansson, J., Alexander, B. & Söderman, U. 2009. Laserskanning från flyg och fornlämningar i skog. Länsstyrelsen Dalarna (PDF).
Koivisto, S. & Laulamaa, V. 2012. Pistepilvessä – Metsien arkeologiset kohteet LiDAR-ilmalaserkeilausaineistoissa. Arkeologipäivät 2012 (PDF).
Risbøl, O., Bollandsås, O.M., Nesbakken, A., Ørka, H.O., Næsset, E., Gobakken, T. 2013. Interpreting cultural remains in airborne laser scanning generated digital terrain models: effects of size and shape on detection success rates. Journal of Archaeological Science 40:4688–4700.

Density and Spacing of LiDAR

Recently I worked with LiDAR from an Optech Gemini scanner in the Philippines and with LiDAR from a RIEGL Q680i scanner in Thailand. The two devices scan the terrain below with a very different pattern: the Optech uses an oscillating mirror producing zig-zag scan lines, whereas the RIEGL uses a rotating polygon producing parallel scan lines. In the following we investigate the point distribution in a small piece cut from each flightline.

First we compute an estimate for the average point density and the average point spacing with lasinfo:

D:\lastools\bin>lasinfo -i optech.laz ^
                        -nh -nv -nmm -cd
number of last returns: 2330671
covered area in square units/kilounits: 1280956/1.28
point density: all returns 2.25 last only 1.82 (per unit^2)
      spacing: all returns 0.67 last only 0.74 (in units)

The option ‘-nh’ asks lasinfo not to print the header (aka ‘no header’), the option ‘-nv’ means not to print the variable length records (aka ‘no vlr’), the option ‘-nmm’ supresses the output of minimum and maximum point values (aka ‘no min max’), and the option ‘-cd’ requests that an average point density is computed (aka ‘compute density’) from which the average point spacing is derived.

D:\lastools\bin>lasinfo -i riegl.laz ^
                        -nh -nv -nmm -cd
number of last returns: 3707217
covered area in square units/kilounits: 889816/0.89
point density: all returns 4.58 last only 4.17 (per unit^2)
      spacing: all returns 0.47 last only 0.49 (in units)

What we really want to know are pulse density and pulse spacing which we get by only counting one return from every pulse. Commonly one uses the last return, which is reported as ‘last only’ values by lasinfo. According to lasinfo the Optech and the RIEGL scan have an average pulse density of 1.82 and 4.17 [pulses per square meter] and an average pulse spacing of 0.74 and 0.49 [meters] respectively.

Single number averages do not capture anything about the actual distribution of the last returns in the swath of the scan. Let us compute density rasters with lasgrid. Because the density of the two scans is different we use 3 by 3 meter cells for the Optech, which should average 3 * 3 * 1.82 = 16.38 points per cell, and 2 by 2 meter cells for the RIEGL, which should average 2 * 2 * 4.17 = 16.68 points per cell.

lasgrid -i optech.laz -last_only ^
        -density -step 3 ^
        -odix _density3x3 -obil
lasview -i optech_density3x3.bil

lasgrid -i riegl.laz -last_only ^
        -density -step 2 ^
        -odix _density2x2 -obil
lasview -i riegl_density2x2.bil

A visual comparison of the two resulting density rasters in BIL format with lasview shows first differences in pulse distribution for the two scanners.

Using 'lasview' to inspect a BIL density raster: Elevation and colors show number of points per 3 by 3 meter cell for Optech (top) and per 2 by 2 meter cell for RIEGL (bottom).

Using ‘lasview’ to inspect a BIL density raster: Elevations and colors illustrate the number of last returns per 3 by 3 meter cell for Optech (top) and per 2 by 2 meter cell for RIEGL (bottom). The images use differently scaled color ramps.

It is noticable that the number of last returns per cell increases at the edges of the swath for the Optech whereas it decreases for the RIEGL. For the Optech it is higher due to the slowing down of the oscillating mirror and the reversing of scan direction when reaching either side of the scanline. This decreases the distances between pulses at the edges of the zig-zag scan and increases the number of last returns falling into cells there. For the rotating polygon scan of the RIEGL these pulse distances are more uniform. Here the numbers are lower because many cells along the edges of the swath are only partly covered by the scan and therefore receive fewer last returns.

lasgrid -i optech.laz -last_only ^
        -density -step 3 ^
        -false -set_min_max 0 20 ^
        -odix _density3x3 -opng

lasgrid -i riegl.laz -last_only ^
        -density -step 2 ^
        -false -set_min_max 0 20 ^
        -odix _density2x2 -opng

Here another visualization of the pulse distribution by letting lasgrid false-color density rasters to a fixed range of 0 to 20. This range is based on the near identical expected values of around 16.38 and 16.68 last returns per cell based on the average densities that lasinfo reported and the cell sized used here.

A more quantitative check of how well distributed the pulse are is to generate histograms. You can use lasinfo with the ‘-histo z 1’ option to print a histogram of the z values of the BIL rasters and import these statistics into your favorite software to create a chart.

lasinfo -i optech_density3x3.bil ^
        -nh -nv -nmm -histo z 1
lasinfo -i riegl_density2x2.bil ^
        -nh -nv -nmm -histo z 1

Both distributions have their peaks at the expected 16 to 17 last returns per cell although the Optech distribution has a significantly wider spead. The odd two-peak distribution of the Optech scan seems puzzling at first but looking at the density image shows that this is an aliasing artifact of putting a fixed 3 by 3 meter grid over zig-zagging scanlines.

The next experiments uses the new ‘-edge_shortest’ and ‘-edge_longest’ options available in las2dem that compute for each point the length of its shortest or longest edge in a Delaunay triangulation and then rasters these lengths. Below you see the command lines to do this and generate histograms of the rastered edge lengths.

las2dem -i optech.laz -last_only ^
        -step 1.5 ^
        -edge_longest ^
        -odix _edge_longest -obil
lasinfo -i optech_edge_longest.bil ^
        -nh -nv -nmm -histo z 0.1 ^
        -o optech_edge_shortest_0_10.txt
las2dem -i optech.laz -last_only ^
        -step 1.5 ^
        -edge_shortest ^
        -odix _edge_shortest -obil
lasinfo -i optech_edge_shortest.bil ^
        -nh -nv -nmm -histo z 0.05 ^
        -o optech_edge_shortest_0_05.txt
las2dem -i riegl.laz -last_only ^
        -edge_longest ^
        -odix _edge_longest -obil
lasinfo -i riegl_edge_longest.bil ^
        -nh -nv -nmm -histo z 0.05 ^
        -o riegl_edge_longest_0_05.txt
las2dem -i riegl.laz -last_only ^
        -edge_shortest ^
        -odix _edge_shortest -obil
lasinfo -i riegl_edge_shortest.bil ^
        -nh -nv -nmm -histo z 0.05 ^
        -o riegl_edge_shortest_0_05.txt

These resulting histograms do look quite different.

The more important histograms are those of longest edge lengths. They illustrate the observed spacing between pulses showing us the maximal distance of each pulse from its surrounding pulses. A perfect pulse distribution that samples the ground evenly in all directions would have one narrow peak. The Optech scan is far from this ideal with a wide and flat peak that tells us that pulses are spaced apart anywhere from 1.2 to 2.2 meters. Visualizing the scan pattern shows that the 2.2 meter spacings happen at the tips of zig-zag and the 1.2 meter spacings at nadir. The RIEGL scan has much narrower peak with most pulses spaced apart at most 60 to 80 centimeters.

The histograms of shortest edge lengths illustrate how close pulses are spaced. Again, it helps to visualize the zig-zag scan pattern to explain the odd distribution of shortest edge lengths in the Optech scan: as we get closer to the edges of the flightline the pulse spacing along the scanline becomes increasingly dense. This is represented on the left side in the histogram that is slowly leveling off to zero. Again, the RIEGL scan has a narrower peak with most pulses spaced no closer than 35 to 55 centimeters.

To confirm our findings we illustrate where the scanners produce longest or shortest edges using ranges that we find in the histograms above and create false-colored rasters:

las2dem -i optech.laz -last_only ^
        -step 1.5 -edge_longest ^
        -false -set_min_max 1 2.5 ^
        -odix _edge_longest -opng
las2dem -i optech.laz -last_only ^
        -step 1.5 -edge_shortest ^
        -false -set_min_max 0 0.7 ^
        -odix _edge_shortest -opng
las2dem -i riegl.laz -last_only ^
        -edge_longest ^
        -false -set_min_max 0.5 1.0 ^
        -odix _edge_longest -opng
las2dem -i riegl.laz -last_only ^
        -edge_shortest ^
        -false -set_min_max 0.1 0.7 ^
        -odix _edge_shortest -opng

The color-codings clearly illustrate that the Optech has a much wider pulse spacing on both edges of the flightline but also a much narrower pulse spacing. That reads contradictory but one are the “within-zig-zag” spacings and the other are the “between-zig-zag” spacings. The RIEGL has an overall much more even distribution in pulse spacings.

As a final confirmation to what causes the pulse spacings to be both widest and narrowest at the edges of the flightline for the Optech scan we scrutinizing the distribution of last returns and their triangulation visually.

Now it should be clear. At the edge of the flightline the Optech scanner has increasingly close-spaced pulses within each zig-zagging scan line but also increasingly distant-spaced pulses between subsequent pairs of zig-zagging scanlines. At the very edge of the flightline the scanner is almost sampling a “linear area” twice but leaves unsampled gaps that are twice as wide as in the center of the flightline. I believe this is one of the reasons why people “cut off” the edges of the flightlines based on the scan angle when using oscillating mirror systems. Now let’s took at the RIEGL.

There is no obvious difference in pulse distribution at the edges and the center of the flightline. The entire width of the swath seems more or less evenly sampled.

The above results show that “average point density” and “average point spacing” do not capture the whole picture. A quality check that properly verifies that a scan has the desired point / pulse density and spacing should measure the actual spacings – especially when operating a scanner with oscillating mirrors. We have done it here by computing for each last return its longest surrounding edge in the Delaunay triangulation, rasterizing these edge length, and then generating a histogram of the raster values.

 

Good follow-up reads are Dr. Ullrich’s “Impact of point distribution on information content of point clouds of airborne LiDAR” presented at ELMF 2013 and “Assessing the Information Content of LiDAR Point Clouds” from PhoWo 2013.