Completeness and Correctness of Discrete LiDAR Returns per Laser Pulse fired

Again and again we have preached about the importance of quality checking when you first get your expensive LiDAR data from the vendor or your free LiDAR data from an open data portal. The minimal quality check we usually advocate consists of lasinfo, lasvalidate, lasoverlap, and lasgrid. The information computed by these LAStools can reassure you that the data contains the right information, is specification conform, has properly aligned flight lines, and has the density distribution you expect. For deliveries or downloads in LAZ format we in addition recommend running laszip with the option ‘-check’ to find the rare file that might have gotten bit-corrupted or truncated during the transfer or the download. Today we learn about a more advanced quality check that can be done by running lassort followed by lasreturn.

For every laser shot fired there are usually between one to five discrete LiDAR returns and some full-waveform systems may even deliver up to fifteen returns. Each of these one to fifteen returns is then given the exact same GPS time stamp that corresponds to the moment in time the laser pulse was fired. By having these unique GPS time stamps we can always recover the set of returns that come from the same laser shot. This makes it possible to check completeness (are all the returns in the file) and correctness (is the returns numbering correct) for the discrete returns of each laser pulse.

optech_galaxy_issue

Showing all sets of returns in the file that do not have an unique GPS time stamp because the set has one or more duplicate returns (e.g. two first returns, two second returns, … ).

With LAStools we can do this by running lassort followed by lasreturn for any LiDAR that comes from a single beam system. For LiDAR that comes from some multi-beam system, such as the Velodyne 16, 32, 64, or 128, the Optech Pegasus, the RIEGL LMS 1560 (aka “crossfire”), or the Leica ALS70 or ALS80 we first need to seperate the files into one file per beam, which can be done with lassplit.  In the following we investigate data coming from an Optech Galaxy single-beam system. First we sort the returns by GPS time stamp using lassort (this step can be omitted if the data is already sorted in acquisition order (aka by increasing GPS time stamps)) and then we check the return numbering with lasreturn:

lassort -i L001-1-M01-S1-C1_r.laz -gps_time -odix _sorted -olaz

lasreturn -i L001-1-M01-S1-C1_r_sorted.laz -check_return_numbering
checked returns of 11809046 multi and 8585573 single return pulses. took 26.278 secs
missing: 0 duplicate: 560717 too large: 0 zero: 0
duplicate
========
200543 returns with n = 1 and r = 1 are duplicate
80548 returns with n = 2 and r = 1 are duplicate
80548 returns with n = 2 and r = 2 are duplicate
41962 returns with n = 3 and r = 1 are duplicate
41962 returns with n = 3 and r = 2 are duplicate
41962 returns with n = 3 and r = 3 are duplicate
13753 returns with n = 4 and r = 1 are duplicate
13753 returns with n = 4 and r = 2 are duplicate
13753 returns with n = 4 and r = 3 are duplicate
13753 returns with n = 4 and r = 4 are duplicate
3636 returns with n = 5 and r = 1 are duplicate
3636 returns with n = 5 and r = 2 are duplicate
3636 returns with n = 5 and r = 3 are duplicate
3636 returns with n = 5 and r = 4 are duplicate
3636 returns with n = 5 and r = 5 are duplicate
WARNING: there are 59462 GPS time stamps that have returns with different number of returns

The output we see above indicates a problem in the return numbering. A recently added new options to lasreturn that allow to reclassify those returns that seem to be part of a problematic set of returns that either contains missing returns, duplicate returns, or returns with different values for the “numbers of returns of given pulse” attribute. This allows us to visualize the issue with lasview. All returns whose are part of a problematic set is shown in the image above.

lasreturn -i L001-1-M01-S1-C1_r_sorted.laz ^
          -check_return_numbering ^
          -classify_as 8 ^
          -classify_duplicate_as 9 ^
          -classify_violation_as 7 ^
          -odix _marked -olaz

This command will mark all sets of returns (i.e. returns that have the exact same GPS time stamp) that have missing returns as 8, that have duplicate returns as 9, and that have returns which different “number of returns per pulse” attribute as 7. The data we have here has no missing returns (no returns are classified as 8) but we have duplicate (9) and violating (7) returns. We look at them closely in single scan lines to conclude.

It immediately becomes obvious that the same GPS time stamp was assigned to the returns of pair of subsequent shots. If the subsequent shots have the same number of returns per shot they are classified as duplicate (9 or blue). If the subsequent shots have different number of returns per shot they are marked as violating (7 or violett) but the reason for the issue is the same. We can look at a few of these return sets in ASCII. Here two subsequent four return shots that have the same GPS time stamp.

237881.011730 4 1 691602.736 5878246.425 141.992 6 79
237881.011730 4 2 691602.822 5878246.415 141.173 6 89
237881.011730 4 3 691603.051 5878246.389 138.993 6 44
237881.011730 4 4 691603.350 5878246.356 136.150 6 169
237881.011730 4 1 691602.793 5878246.439 142.037 6 114
237881.011730 4 2 691602.883 5878246.429 141.185 6 96
237881.011730 4 3 691603.109 5878246.404 139.033 6 50
237881.011730 4 4 691603.414 5878246.370 136.129 6 137

Here a four return shot followed by a three return shot that have the same GPS time stamp.

237881.047753 4 1 691603.387 5878244.501 140.187 6 50
237881.047753 4 2 691603.602 5878244.476 138.141 6 114
237881.047753 4 3 691603.776 5878244.456 136.490 6 60
237881.047753 4 4 691603.957 5878244.436 134.767 6 116
237881.047753 3 1 691603.676 5878244.492 138.132 6 97
237881.047753 3 2 691603.845 5878244.473 136.534 6 90
237881.047753 3 3 691604.034 5878244.452 134.739 6 99

It appears the GPS time counter in the LMS export software did not store the GPS time with sufficient resolution to always distinguish subsequent shots. The issue was confirmed by Optech and was already fixed a few months ago.

We should point out that these double-used GPS time stamps have zero impact on the geometric quality of the point cloud or the distribution of returns. The drawback is that not all returns can easily be grouped into one unique set per laser shot and that the files are not entirely specification conform. Any software that relies on accurate and unique GPS time stamps (such as flight line alignment software) may potentially struggle as well. The bug of the twice-used GPS time stamps was a discovery that is probably of such low consequence that no user of Optech Galaxy data had noticed it in the 4 years that Galaxy had been sold … until we really really scrutinized some data from one of our clients. Optech reports that the issue has been fixed now. But there are other vendors out there with even more serious GPS time and return numbering issues … to be continued.

How Many Decimal Digits for Storing Longitude & Latitude?

Recently I came across this tweet containing the image below and it made me laugh … albeit not in the original way the tweet intended. The tweet was joking that “Anyone is able to open a GeoJSON file” and included the Microsoft Word screen shot seen below as a response to someone else tweeting that “Handing in a project as @GeoJSON. Let’s see if I get the usual “I can’t open this file” even though […]”. What was funny to me was seeing longitude and latitude coordinates stored with 15 decimal digits right of the decimal point. There are many memes about “German efficiency” but few about “German accuracy” 😁. Clearly it is time for another blog post about storage resolution and positional accuracy. The last blog post came on the heels of the national open elevation release of England with insane vertical resolution.

Longitude and latitude coordinates are stored with 15 decimal digits right of the decimal points.

By default LAStools will use 7 decimal digits to store longitude and latitude coordinates to a LAS or LAZ file. But what do 15 decimal digits mean for longitude and latitude coordinates? How “accurate” are the corresponding coordinates when converted to projected  coordinates? I took the second coordinate pair [ 10.049567988755534, 53.462766759577057 ] shown in the screen shot above and converted it from longitude and latitude to the easting and northing values of the WGS 84 / UTM 32N projection that has EPSG code 32632. Before conversion we quantize these numbers to have 5 through 15 decimal digits and then record the absolute difference to the coordinate pair that uses the most digits.

Number o decimal digits for longitude and latitude coordinate and absolute difference in projected position.

The table above shows that – at least for this particular longitude and latitude coordinate pair located in Germany – that 7 digits are sufficient to store coordinates with centimeter [cm] accuracy and that 8 digits are enough to store coordinates with millimeter [mm] accuracy. Any additional digit right of the decimal point will only be necessary when we need micrometer [um] or nanometer [nm] accuracy, which is very unlikely to be the case in most geospatial applications.

This means we could remove the 7 or 8 right most digits of each number from the screenshot that was tweeted and make this GeoJSON file even smaller, faster, and easier to store, transmit, open, and read. After this post was tweeted there was a follow-up tweet suggesting to have a look at this site for a more detailed analysis of what accuracy each digit in a longitude and latitude coordinate can store.