LASmoons: Zak Kus

Zak Kus (recipient of three LASmoons)
Topology Enthusiast
San Francisco, USA

While LiDAR data enables a lot of research and innovation in a lot of fields, it can also be used to create unique and visceral art. Using the high resolution data available, a 3D printer, and a long tool chain, we can create a physical, 3D topological map of the San Francisco bay area that shows off both the city’s hilly geology, and its unique skyline.


Test print of San Francisco’s Golden Gate Park.


Test print of San Francisco’s Golden Gate Park.

The ultimate goal of this project is to create an accurate, unique physical map of San Francisco, and the surrounding areas, which will be given to a loved one as a birthday gift. Using the data from the 2010 ARRA-CA GoldenGate survey, we can filter and process the raw lidar data into a DEM format using LAStools, which can be converted using a python script into a “water tight” 3D printable STL file.

While the data works fairly well out of the box, it does require a lot of manual editing, to remove noise spikes, and to delineate the coast line from the water in low lng areas. Interestingly, while many sophisticated tools exist to edit STLs that could in theory be used to clean up and prepare the files at the STL stage, few are capable of even opening files with so much detailed data. Using LAStools to manually classify, and remove unwanted data is the only way to achieve the desired level of detail in the final piece.

LiDAR data provided through USGS OpenTopography, using the ARRA-CA GoldenGate 2010 survey
+ Average point density of 3.33 pts/m^2 (though denser around SF)
+ Covers 2638 km^2 in total (only a ~100 km^2 subset is used)

LAStools processing:
Remove noise [lasnoise]
2) Manually clean up shorelines and problematic structures [lasview, laslayers]
3) Combine multiple tiles (to fit 3d printer) [lasmerge]
4) Create DEMs (asc format) for external tool to process [las2dem]

Converting Rasters from inefficient ASCII XYZ to more compact LAZ or TIF Formats

The German state of Brandenburg has recently started to provide many of their basic geospatial data as open data, such as digital ortophotos in TIF and JPG formats, vertical and horizontal control points in gzipped XML format, LOD1 and LOD2 building models in zipped GML format, topographic maps from 1:10000 to 1:100000 in zipped TIF and PDF formats, cadastral data in zipped XML and TIF formats, as well as LiDAR-derived 1m DTM rasters and image-derived 1m DSM rasters both in zipped XYZ ASCII format. All this data is provided with the user-friendly license called “Datenlizenz Deutschland Namensnennung 2.0“. In this article we show how to convert the 1m DTM rasters and the 1m DSM rasters  from verbose XYZ ASCII to more compact LAZ or TIF rasters.


Four 2000 by 2000 meter tiles of the Brandenburg 1m DTM. 

One particularity about most official German and Austrian rasters (anywhere else?) is that they sample the elevations in the corners rather than in the center of each raster cell. Here a one square kilometer raster tile of 1 meter resolution will have 1001 columns by 1001 rows instead of the more familiar 1000 by 1000 layout. While this corner-based representation does have some benefits, we convert these rasters in to the more common area-based representation using new functionality recently added to lasgrid.

After downloading one sample DTM tile such as we find three files in the zip folder. Two files with meta data and license information and the actual data file, which is a 2 km by 2km corner-based raster tile called “” with 2001 columns by 2001 rows. Here is how the 4004001 lines looks:

250000.0 5886000.0 15.284
250001.0 5886000.0 15.277
250002.0 5886000.0 15.273
250003.0 5886000.0 15.275
250004.0 5886000.0 15.289
250005.0 5886000.0 15.314
251994.0 5888000.0 13.565
251995.0 5888000.0 13.567
251996.0 5888000.0 13.565
251997.0 5888000.0 13.565
251998.0 5888000.0 13.564
251999.0 5888000.0 13.564
252000.0 5888000.0 13.565

The first step is to convert these XYZ rasters to LAZ format. We do this with txt2las as shown below. In case the vertical datum is the “Deutsches Haupthoehennetz 2016” we should also add ‘-vertical_dhhn2016’ but not sure at the moment:

txt2las -i dgm\*.xyz ^
        -set_scale 1.0 1.0 0.001 ^
        -epsg 25833 ^
        -odir temp -olaz ^
        -cores 4

For 84 files this reduces the size by a factor of 31 or compresses it down to 3.2 percent of the original, namely from 8.45 GB for raw XYZ to 277 MB for LAZ. So far we have really just converted a list of x, y and z coordinates from verbose ASCII to more compact LAZ. We can easily go back to ASCII with las2txt whenever needed:

txt2las -i temp\*.laz ^
        -odir ascii -otxt ^
        -cores 4

Next we use lasgrid to convert from a corner-based raster to an area-based raster using the new option ‘-subsquare 0.2’ which replaces each input point by four points that are displaced by all possibilities of adding +/- 0.2 in x and y. We then average the exactly four points that fall into each relevant raster cell with option ‘-average’ and clip the output to the meaningful 2000 columns by 2000 rows with ‘-use_tile_size 2000’. You need to get the most recent version of LAStools to have these options.

lasgrid -i temp\*.laz ^
        -subsquare 0.25 ^
        -step 1 -average ^
        -use_tile_size 2000 ^
        -odir dgm -olaz ^
        -cores 4

Instead of RasterLAZ you can also choose the TIF, BIL, IMG, or ASC format here. The final result are standard 1 meter elevation products with 2000 columns by 2000 rows with the averaged elevation sample being associated with the center of the raster cell. The lasinforeport for a sample tile is shown at the end of this article.

You may proceed to optimize the RasterLAZ for area-of-interest queries by reordering the raster into a space-filling curve with lassort or lasoptimize and compute a spatial index. You may also classify the RasterLAZ elevation samples, for example, into building, high, medium, and low vegetation, ground, and other common classifications with lasclip or lascolor. You may also add RGB or intensity values to the RasterLAZ elevation samples using the orthophotos that are also available as open data with lascolor. These are some of the benefits of RasterLAZ beyond efficient storage and access.

We like to acknowledge the LGB (Landesvermessung und Geobasisinformation Brandenburg) for providing state-wide coverage of their geospatial data holdings as easily downloadable open data with the user-friendly Deutschland Namensnennung 2.0 license. But we also would like to ask to please add the raw LiDAR point clouds to the open data portal. The storage savings in going from ASCII XYZ to LAZ for the DTM and DSM rasters should  free enough space to host the LiDAR … (-;

lasinfo (200112) report for 'dgm_33\DGM_33250-5886.laz'
reporting all LAS header entries:
  file signature:             'LASF'
  file source ID:             0
  global_encoding:            0
  project ID GUID data 1-4:   00000000-0000-0000-0000-000000000000
  version major.minor:        1.2
  system identifier:          'raster compressed as LAZ points'
  generating software:        'LAStools (c) by rapidlasso GmbH'
  file creation day/year:     13/20
  header size:                227
  offset to point data:       455
  number var. length records: 2
  point data format:          0
  point data record length:   20
  number of point records:    4000000
  number of points by return: 4000000 0 0 0 0
  scale factor x y z:         0.5 0.5 0.001
  offset x y z:               200000 5800000 0
  min x y z:                  250000.5 5886000.5 13.419
  max x y z:                  251999.5 5887999.5 33.848
variable length header record 1 of 2:
  reserved             0
  user ID              'Raster LAZ'
  record ID            7113
  length after header  80
  description          'by LAStools of rapidlasso GmbH'
    ncols   2000
    nrows   2000
    llx   250000
    lly   5886000
    stepx    1
    stepy    1
    sigmaxy <not set>
variable length header record 2 of 2:
  reserved             0
  user ID              'LASF_Projection'
  record ID            34735
  length after header  40
  description          'by LAStools of rapidlasso GmbH'
    GeoKeyDirectoryTag version 1.1.0 number of keys 4
      key 1024 tiff_tag_location 0 count 1 value_offset 1 - GTModelTypeGeoKey: ModelTypeProjected
      key 3072 tiff_tag_location 0 count 1 value_offset 25833 - ProjectedCSTypeGeoKey: ETRS89 / UTM 33N
      key 3076 tiff_tag_location 0 count 1 value_offset 9001 - ProjLinearUnitsGeoKey: Linear_Meter
      key 4099 tiff_tag_location 0 count 1 value_offset 9001 - VerticalUnitsGeoKey: Linear_Meter
LASzip compression (version 3.4r3 c2 50000): POINT10 2
reporting minimum and maximum for all LAS point record entries ...
  X              100001     103999
  Y              172001     175999
  Z               13419      33848
  intensity           0          0
  return_number       1          1
  number_of_returns   1          1
  edge_of_flight_line 0          0
  scan_direction_flag 0          0
  classification      0          0
  scan_angle_rank     0          0
  user_data           0          0
  point_source_ID     0          0
number of first returns:        4000000
number of intermediate returns: 0
number of last returns:         4000000
number of single returns:       4000000
overview over number of returns of given pulse: 4000000 0 0 0 0 0 0
histogram of classification of points:
         4000000  never classified (0)

Removing Noise from Single Photon LiDAR to Generate a Smooth DTM

A while back we had a first look at the Single Photon LiDAR from Leica’s SPL100 sensor (that eventually turned out just to be an SPL99 because one beamlet or one receiver in the 10 by 10 array was broken and did not produce any returns). Today we are taking a closer look at a strategy to remove the excessive noise in the raw Single Photon LiDAR data from a “proper” SPL100 sensor (where all of the 100 beamlets are firing) that was flown in 2017 in Navarra, Spain.


Profile through original points on top of generated DTM.

The data was provided as open data by the cartography section of Navarra’s Government and is available via a simple download FTP portal. We describe the LAStools processing steps that were used to eliminate the excessive noise and to generate a smooth DTM. In the following we are using the originally released version of the data, that we obtained shortly after the portal went online that seems to be a bit more “raw” than the current files available now. One starndard quality check with lasinfo was done with:

lasinfo -i 0_raw\*.laz ^
        -cd ^
        -histo intensity 1 ^
        -histo user_data 1 ^
        -histo point_source 1 ^
        -histo gps_time 10 ^
        -odir 1_quality -odix _info -otxt

Upon inspecting the lasinfo report we suggest a few changes in how to store this Single Photon LiDAR data for more efficient hosting via an online portal. We perform these changes here before starting the actual processing. First we use the las2las call shown below to fix an error in the global encoding bits, remove an irrelevant VLR, re-scale the coordinates from millimeter to centimeters, re-offset the coordinates to nice numbers, and – what is by far the most crucial change for better compression – remap the beamlet ID stored in the ‘user data’ field as described in an earlier article.

las2las -i 0_raw\*.laz ^
        -rescale 0.01 0.01 0.01 ^
        -auto_reoffset ^
        -set_global_encoding_gps_bit 1 ^
        -remove_vlr 1 ^
        -map_user_data beamlet_ID_map.txt ^
        -odir 2_fix_rescale_reoffset_remap -olaz ^
        -cores 3

Then we use two lassort calls, one to maximize compression and one to improve spatial coherence. One lassort call rearranges the points in increasing order first based on the GPS time stamps, then breaks ties based on the user data field (that stores the beamlet ID), and finally stores the returns of every beamlet ordered by return number. We also add spatial reference information in this step. The other lassort call rearranges the points into a spatially coherent layout. It uses a Z-order sort with the granularity of 50 meter by 50 meter buckets of points. Within each bucket the point order from the prior sort is kept.

lassort -i 2_fix_rescale_reoffset_remap\*.laz ^
        -epsg 25830 ^
        -gps_time ^
        -user_data ^
        -return_number ^
        -odir 2_maximum_compression -olaz ^
        -cores 3

lassort -i 2_maximum_compression\*.laz ^
        -bucket_size 50 ^
        -odir 2_spatial_coherence -olaz ^
        -cores 3

The resulting optimized nine tiles are around 200 MB each and can be downloaded as one file here or as individual tiles here:

Now we start the usual processing workflow by tiling the data with lastile into smaller 500 meter by 500 meter tiles with a 25 meter buffer. We also set the pre-existing point classification in the data to zero as we will compute our own later.

lastile -i 2_spatial_coherence\*.laz ^
        -set_classification 0 ^
        -tile_size 500 -buffer 25 -flag_as_withheld ^
        -odir 3_buffered -o yecora.laz

We notice that a large amount of the noise has intensity values below 1000. We are still a bit puzzled where those intensity values come from and what exactly they mean in a Single Photon LiDAR system. But it works. We run las2las with a “filtered transform” to set classification of all points whose intensity value is 1000 or less to the classification code 7 (aka “noise”).

las2las -i 3_buffered\*.laz ^
        -keep_intensity_below 1000 ^
        -filtered_transform ^
        -set_classification 7 ^
        -odir 4_intensity_denoised -olaz ^
        -cores 3

We then ignore this “easy-to-identify” noise and go after the remaining one with lasnoise by ignoring classification code 7 and setting the newly identified noise to classification code 9 – not because it’s “water” (the usual meaning of class 9) but because these points are drawn with a distinct blue color when checking the result with lasview.

 lasnoise -i 4_intensity_denoised\*.laz ^
         -ignore_class 7 ^
         -step_xy 1.0 -step_z 0.2 ^
         -isolated 5 ^
         -classify_as 9 ^
         -odir 4_isolation_denoised -olaz ^
         -cores 3

Of the surviving non-noise points we then use lasthin to reclassify the point closest to the 20th elevation percentile per 50 cm by 50 cm area with classification code 8 (for all areas that have more than 5 non-noise points per 50 cm by 50 cm area. We repeat the same for every 1 meter by 1 meter area.

lasthin -i 4_isolation_denoised\*.laz ^
        -ignore_class 7 9 ^
        -step 0.5 -percentile 20 5 ^
        -classify_as 8 ^
        -odir 5_thinned_p20_050cm -olaz ^
        -cores 3

lasthin -i 5_thinned_p20_050cm\*.laz ^
        -ignore_class 7 9 ^
        -step 1.0 -percentile 20 5 ^
        -classify_as 8 ^
        -odir 5_thinned_p20_100cm -olaz ^
        -cores 3

We then perform a more agressive second noise removal step one with lasnoise using only those points with classification code 8, namely those non-noise points that were the 20th elevation percentile in either a 50 cm by 50 cm cell or a 1 meter by 1 meter cell. This can be done by ignoring classification code 0, 7, and 9. We mark those noise points as 6 so they appear orange in the point cloud with lasview.

lasnoise -i 5_thinned_p20_100cm\*.laz ^
         -ignore_class 0 7 9 ^
         -step_xy 2.0 -step_z 0.2 ^
         -isolated 1 ^
         -classify_as 6 ^
         -odir 5_thinned_p20_100cm_denoised -olaz ^
         -cores 3

The 20th elevation percentile points that survive the last noise removal are then classified into ground (2) and non-ground (1) points with lasground_new by ignoring all other points, namely those with classification codes 0, 6, 7, and 9.

lasground_new -i 5_thinned_p20_100cm_denoised\*.laz ^
              -ignore_class 0 6 7 9 ^
              -town ^
              -odir 5_tiles_ground_050cm -olaz ^
              -cores 3

These images below illustrate the steps we took. They also show that not all data was used and might give you ideas where to tweak our workflow for even better results.

Finally we raster the ground points into 1 meter Digital Terrain Model (DTM) rasters with las2dem and store the result (without buffers) to the RasterLAZ format.

las2dem -i 5_tiles_ground_050cm\*.laz ^
        -keep_class 2 ^
        -step 1.0 ^
        -use_tile_bb ^
        -odir 6_tiles_dtm_100cm -olaz ^
        -cores 3

Finally we merged all RasterLAZ tiles into one and compute the final hillshaded DTM with blast2dem.

blast2dem -i 6_tiles_dtm_100cm\*.laz -merged ^
          -step 1.0 ^
          -hillshade ^
          -o yecora_dtm_100cm.png

The hillshaded DTM that is result of the entire sequence of processing steps described above is shown below.

DTM from ground classification created with LAStools

For comparison we generate the same DTM using the originally provided classification. According to the README file the original ground points are classified with code 22 in areas of flight line overlap and as the usual code 2 elsewhere. Hence we must use both classification codes to construct the DTM. We do this analogue to the earlier processing steps with the three LAStools commands lastile, las2dem, and blast2dem below.

lastile -i 2_spatial_coherence\*.laz ^
        -tile_size 500 -buffer 25 -flag_as_withheld ^
        -odir 3_tiles_buffered_orig -o yecora.laz

las2dem -i 3_tiles_buffered_orig\*.laz ^
        -keep_class 2 22 ^
        -step 1.0 ^
        -use_tile_bb ^
        -odir 6_tiles_dtm_100cm_orig -olaz ^
        -cores 3

blast2dem -i 6_tiles_dtm_100cm_orig\*.laz -merged ^
          -step 1.0 ^
          -hillshade ^
          -o yecora_dtm_100cm_orig.png

Below the hillshaded DTM generated from the ground classification that was provided with the LiDAR when it was originally released as open data.

DTM from ground classification of originally released data.

In the meantime Andorra’s SPL data have been updated with a newer version in the open data portal. The new version of the data contains a much better ground classification that might have been improved manually as the new files now have the the string ‘cam’ instead of ‘ca’ in the file name, which probably means ‘classified automatically and manually’ instead of the original ‘classified automatically’. We decided not to switch to the new data release as it seemed less “raw” than the original release. For example there are suddenly points with GPS times and returns counts and numbers of zero in the file that seem synthetic. But we also computed the hillshaded DTM for the new release which is shown below.

DTM from ground classification of newly released data.

We thank the cartography section of Navarra’s Government for providing their LiDAR as open data. This not only allows re-purposing expensive data paid for by public taxes but also generates additional value, encourages citizen science, and provides educational opportunity and insights such as this blog article.

Another German State Goes Open LiDAR: Saxony

Finally some really good news out of Saxony. 😊 After North Rhine-Westphalia and Thuringia released the first significant amounts of open geospatial data in Germany in a one-two punch in January 2017, we now have a third German state opening their entire tax-payer-funded geospatial data holdings to the tax-paying public via a simple and very easy-to-use online download portal. Welcome to the open data party, Saxony!!!

Currently available via the online portal are the LiDAR-derived raster Digital Terrain Model (DTM) at 1 meter resolution (DGM 1m) for everything flown since 2015 and and at 2 meter resolution (DGM 2m) or 20 meter resolution (DGM 20m) for the entire state. The horizontal coordinates use UTM zone 33 with ETRS89 (aka EPSG code 25833) and the vertical coordinate uses the “Deutsche Haupthöhennetz 2016” or “DHHN2016” (aka EPSG code 7837). Also available are orthophotos at 20 cm (!!!) resolution (DOP 20cm).


Overview of current LiDAR holdings. Areas flown 2015 or later have LAS files and 1 meter rasters. Others have LiDAR as ASCII files and lower resolution rasters.

Offline – by ordering through either this online form or that online form – you can also get the 5 meter DTM and the 10 meter DTM, the raw LiDAR point clouds, LiDAR intensity rasters, hill-shaded DTM rasters, as well as the 1 meter and the 2 meter Digital Surface Model (DSM) for a small administrative fee that ranges between 25 EUR and 500 EUR depending on the effort involved.

Our immediate thought is to get a copy on the entire raw LiDAR points clouds (available as LAS 1.2 files for all  data acquired since 2015 and as ASCII text for earlier acquisitions) and find some portal willing to hosts this data online. We are already in contact with the land survey of Saxony to discuss this option and/or alternate plans.

Let’s have a look at the data. First we download four 2 km by 2 km tiles of the 1 meter DTM raster for an area surrounding the so called “Greifensteine” using the interactive map of the download portal, which are provided as simple XYZ text. Here a look at the contents of one ot these tiles:

more Greifensteine\
352000 5613999 636.26
352001 5613999 636.27
352002 5613999 636.28
352003 5613999 636.27
352004 5613999 636.24

Note that the elevation are not sampled in the center of every 1 meter by 1 meter cell but exactly on the full meter coordinate pair, which seems especially common  in German-speaking countries. Using txt2las we convert these XYZ rasters to LAZ format and add geo-referencing information for more efficient subsequent processing.

txt2las -i greifensteine\333* ^
        -set_scale 1 1 0.01 ^
        -epsg 25833 ^

Below you see that going from XYZ to LAZ reduces the amount of  data from 366 MB to 10.4 MB, meaning that the data on disk becomes over 35 times smaller. The ability of LASzip to compress elevation rasters was first noted during the search for missing airliner MH370 and resulted in our new LAZ-based compressor for height grid called DEMzip.  The resulting LAZ files now also include geo-referencing information.

384,000,000 bytes

2,684,820 333525610_dgm1.laz
2,590,516 333525612_dgm1.laz
2,853,851 333545610_dgm1.laz
2,795,430 333545612_dgm1.laz
10,924,617 bytes

Using blast2dem we then create a hill-shaded version of the 1 meter DTM in order to overlay a visual representation of the DTM onto Google Earth.

blast2dem -i greifensteine\333*_dgm1.laz ^
          -merged ^
          -step 1 ^
          -hillshade ^
          -o greifensteine.png

Below the result that nicely shows how the penetrating laser of the LiDAR allows us to strip away the forest to see interesting geological features in the bare-earth terrain.

In a second exercise we use the available RGB orthophoto images to color one of the DTM tiles and explore it using lasview. For this we download the image for the top left of the four tiles that covers the area containing the “Greifensteine” from the interactive download portal for orthophotos. As the resolution of the TIF image is 20 cm and that of the DTM is only 1 meter, we first down-sample the TIF using gdalwarp of GDAL.

gdalwarp -tr 1 1 ^
         -r cubic ^
         greifensteine\dop20c_33352_5612.tif ^

If you are not yet using GDAL today is a good day to start. It nicely complements the point cloud processing functionality of LAStools for raster inputs. Next we use lascolor to give each elevation pixel of the DTM stored in LAZ format its corresponding color from the orthophoto.

lascolor -i greifensteine\333525612_dgm1.laz ^
         -image greifensteine\dop1m_33352_5612.tif ^
         -odix _rgb -olaz

Now we can view the colored DTM in LAZ format interactively with lasview or any other LiDAR viewing software and turn on the RGB colors from the orthophoto as needed to understand the scene.

lasview -i greifensteine\333525612_dgm1_rgb.laz

We thank the “Staatsbetrieb Geobasisinformation und Vermessung Sachsen (GeoSN)” for giving us easy access to the 1 meter DTM and the 20 cm orthophoto that we have used in this article through their new open geodata portal as open data under the user-friendly license “Datenlizenz Deutschland – Namensnennung – Version 2.0.

National Open LiDAR Strategy of Latvia humiliates Germany, Austria, and other European “Closed Data” States

Latvia, officially the Republic of Latvia, is a country in the Baltic region of Northern Europe has around 2 million inhabitants, a territory of 65 thousand square kilometers and – since recently – also a fabulous open LiDAR policy. Here is a list of 65939 tiles in LAS format available for free download that cover the entire country with airborne LiDAR with a density from 4 to 6 pulses per square meters. The data is classified into ground, building, vegetation, water, low noise, and a few other classifications. It is licensed Creative Commons CC0 1.0 – meaning that you can copy, modify, and distribute the data, even for commercial purposes, all without asking permission. And there is a simple and  functional interactive download portal where you can easily download individual tiles.


Interactive open LiDAR download portal of Latvia.

We downloaded the 5 by 5 block of square kilometer tiles matching “4311-32-XX.las” for checking the quality and creating a 1m DTM and a 1m DSM raster. You can follow along after downloading the latest version of LAStools.

Quality Checking

We first run lasvalidate and lasinfo on the downloaded LAS files and then immediately compress them with laszip because multi-core processing of uncompressed LAS files will quickly overwhelm our file system, make processing I/O bound, and result in overall longer processing times with CPUs waiting idly for data to be loaded from the drives.

lasinfo -i 00_tiles_raw\*.las ^
        -compute_density ^
        -histo z 5 ^
        -histo intensity 256 ^
        -histo user_data 1 ^
        -histo scan_angle 1 ^
        -histo point_source 1 ^
        -histo gps_time 10 ^
        -odir 01_quality -odix _info -otxt ^
        -cores 3
lasvalidate -i 00_tiles_raw\*.las ^
            -no_CRS_fail ^
            -o 01_quality\report.xml

Despite already excluding a missing Coordinate Reference System (CRS) from being a reason to fail (the lasinfo reports show that the downloaded LAS files do not have any geo-referencing information) lasvalidate still reports a few failing files, but scrutinizing the resulting XML file ‘report.xml’ shows only minor issues.

Usually during laszip compression we do not alter the contents of a file, but here we also add the EPSG code 3059 for CRS “LKS92 / Latvia TM” as we turn bulky LAS files into slim LAZ files so we don’t have to specify it in all future processing steps.

laszip -i 00_tiles_raw\*.las ^
       -epsg 3059 ^
       -cores 2

Compression reduces the total size of the 25 tiles from over 4.1 GB to below 0.6 GB.

Next we use lasgrid to visualize the last return density which corresponds to the pulse density of the LiDAR survey. We map each 2 by 2 meter pixel where the last return density is 2 or less to blue and each 2 by 2 meter pixel it is 8 or more to red.

lasgrid -i 00_tiles_raw\*.laz ^
        -keep_last ^
        -step 2 ^
        -density_16bit ^
        -false -set_min_max 2 8 ^
        -odir 01_quality -odix _d_2_8 -opng ^
        -cores 3

This we follow by the mandatory lasoverlap check for flight line overlap and alignment where we map the number of overlapping swaths as well as the worst vertical difference between overlapping swaths to a color that allows for quick visual quality checking.

lasoverlap -i 00_tiles_raw\*.laz ^
           -step 2 ^
           -min_diff 0.1 -max_diff 0.2 ^
           -odir 01_quality -opng ^
           -cores 3

The results of the quality checks with lasgrid and lasoverlap are shown below.

Raster Derivative Generation

Now we use first las2dem to create a Digital Terrain Model (DTM) and a Digital Surface Model (DSM) in RasterLAZ format and then use blast2dem to create merged and hill-shaded versions of both. Because we will use on-the-fly buffering to avoid edge effects along tile boundaries we first spatially index the data using lasindex for more efficient access to the points from neighboring tiles.

lasindex -i 00_tiles_raw\*.laz ^
         -cores 3

las2dem -i 00_tiles_raw\*.laz ^
        -keep_class 2 9 ^
        -buffered 25 ^
        -step 1 ^
        -use_orig_bb ^
        -odir Latvia\02_dtm_1m -olaz ^
        -cores 3

blast2dem -i 02_dtm_1m\*.laz ^
          -merged ^
          -hillshade ^
          -step 1 ^
          -o dtm_1m.png

las2dem -i 00_tiles_raw\*.laz ^
        -drop_class 1 7 ^
        -buffered 10 ^
        -spike_free 1.5 ^
        -step 1 ^
        -use_orig_bb ^
        -odir 03_dsm_1m -olaz ^
        -cores 3

blast2dem -i 03_dsm_1m\*.laz ^
          -merged ^
          -hillshade ^
          -step 1 ^
          -o dsm_1m.png

Because the overlaid imagery does not look as nice in our new Google Earth installation, below are the DTM and DSM at versions down-sampled to 25% of their original size.

Many thanks to SunGIS from Latvia who tweeted us about the Open LiDAR after we chatted about it during the Foss4G 2019 gala dinner. Kudos to the Latvian Geospatial Information Agency (LGIA) for implementing a modern national geospatial policy that created opportunity for maximal return of investment by opening the expensive tax-payer funded LiDAR data for re-purposing and innovation without barriers. Kudos!

In Sweden, all they Wanted for Christmas was National LiDAR as Open Data

Let’s heat up some sweet, warm and spicy Glögg in celebration! They must have been good boys and girls up there in Sweden. Because “Jultomten” or simply ”Tomten” – how Sweden’s Santa Clause is called – is assuring a “God Jul” for all the Swedish LiDAR lovers this Christmas season.

Only a few weeks ago this tweet of ours had (mistakenly) included Sweden in a list of European countries that had released their national LiDAR archives as open data for public reuse over the past six years.

Turns out we were correct after all. Sweden has just opened their LiDAR data for free and unencumbered download. To get the data simply create a user account and browse to the ftp site for download as shown in the image sequence below.

The released LiDAR data was collected with a density of 1 to 2 pulses per square meter and is distributed in LASzip compressed LAZ tiles of 2500 by 2500 meters. The returns are classified into four classes: ground (2), water (9), low noise (7) and high noise (18). All items that can not be classified as any of the first four classes coded as left unclassified (1). The LAZ files do not contain CRS information, but this can easily be added with horizontal coordinates in SWERED99 TM (EPSG code 3006) and elevations in RH2000 height (EPSG code 5613).

Below a look with lasview at a 5 km by 5 km area that composed of the four tiles ‘18P001_67100_5800_25.laz‘, ‘18P001_67100_5825_25.laz‘, ‘18P001_67125_5800_25.laz‘ and ‘18P001_67125_5825_25.laz‘ with several of the different color modes available.


Some more details: The data was acquired at flying altitude of around 3000 meter with a maximum scan angle of ± 20º and a minimum side overlap of 10% between the flightlines. The laser footprint on ground is below 75 centimeters with slight variation based on the flying altitude. The laser scanning survey was performed with LiDAR instruments that can provide at least three returns from the same pulse. All LiDAR returns are preserved throughout the entire production chain.

The LiDAR data comes with the incredibly Creative Commons – CC0 license, which means that you can use, disseminate, modify and build on the data – even for commercial purposes – without any restrictions. You are free to acknowledge the source when you distribute the data further, but it is not required.

The LiDAR data will eventually cover approximately 75% of Sweden and new point clouds will continuously be added as additional scanning is performed according to the schedule shown below. The survey will be returning to scan every spot again after about 7 years.

2018-2022 LiDAR acquisition plan for Sweden

Below a lasinfo report for tile ‘18P001_67125_5825_25.laz‘. One noticeable oddity is the distribution of intensities. The histogram across all intensities with bins of size 256 shows two clearly distinct sets of intensities each with their own peak and a void of values between 3000 and 10000.

lasinfo -i 18P001_67125_5825_25.laz -cd -histo intensity 256
reporting all LAS header entries:
  file signature:             'LASF'
  file source ID:             0
  global_encoding:            1
  project ID GUID data 1-4:   00000000-0000-0000-0000-000000000000
  version major.minor:        1.2
  system identifier:          ''
  generating software:        'TerraScan'
  file creation day/year:     303/2018
  header size:                227
  offset to point data:       227
  number var. length records: 0
  point data format:          1
  point data record length:   28
  number of point records:    20670652
  number of points by return: 13947228 4610837 1712043 358397 42147
  scale factor x y z:         0.01 0.01 0.01
  offset x y z:               0 0 0
  min x y z:                  582500.00 6712500.00 64.56
  max x y z:                  584999.99 6714999.99 136.59
LASzip compression (version 3.2r2 c2 50000): POINT10 2 GPSTIME11 2
reporting minimum and maximum for all LAS point record entries ...
  X            58250000   58499999
  Y           671250000  671499999
  Z                6456      13659
  intensity          32      61406
  return_number       1          5
  number_of_returns   1          5
  edge_of_flight_line 0          1
  scan_direction_flag 0          1
  classification      1         18
  scan_angle_rank   -19         19
  user_data           0          1
  point_source_ID  1802       1804
  gps_time 222241082.251248 222676871.876191
number of first returns:        13947228
number of intermediate returns: 2110980
number of last returns:         13952166
number of single returns:       9339722
covered area in square units/kilounits: 5923232/5.92
point density: all returns 3.49 last only 2.36 (per square units)
      spacing: all returns 0.54 last only 0.65 (in units)
overview over number of returns of given pulse: 9339722 5797676 4058773 1263967 210514 0 0
histogram of classification of points:
        10888520  unclassified (1)
         9620725  ground (2)
           22695  noise (7)
          138147  water (9)
             565  Reserved for ASPRS Definition (18)
intensity histogram with bin size 256.000000
  bin [0,256) has 1753205
  bin [256,512) has 3009640
  bin [512,768) has 2240861
  bin [768,1024) has 1970696
  bin [1024,1280) has 1610647
  bin [1280,1536) has 1285858
  bin [1536,1792) has 974475
  bin [1792,2048) has 790480
  bin [2048,2304) has 996926
  bin [2304,2560) has 892755
  bin [2560,2816) has 164142
  bin [2816,3072) has 57367
  bin [3072,3328) has 18
  bin [10752,11008) has 589317
  bin [11008,11264) has 3760
  bin [11264,11520) has 99653
  bin [11520,11776) has 778739
  bin [11776,12032) has 1393569
  bin [12032,12288) has 1356850
  bin [12288,12544) has 533202
  bin [12544,12800) has 140223
  bin [12800,13056) has 16195
  bin [13056,13312) has 2319
  bin [13312,13568) has 977
  bin [13568,13824) has 765
  bin [13824,14080) has 648
  bin [14080,14336) has 289
  bin [14336,14592) has 513
  bin [14592,14848) has 383
  bin [14848,15104) has 178
  bin [15104,15360) has 526
  bin [15360,15616) has 108
  bin [15616,15872) has 263
  bin [15872,16128) has 289
  bin [16128,16384) has 69
  bin [16384,16640) has 390
  bin [16640,16896) has 51
  bin [16896,17152) has 186
  bin [17152,17408) has 239
  bin [17408,17664) has 169
  bin [17664,17920) has 58
  bin [17920,18176) has 227
  bin [18176,18432) has 169
  bin [18432,18688) has 40
  bin [18688,18944) has 401
  bin [18944,19200) has 30
  bin [19200,19456) has 411
  bin [19456,19712) has 34
  bin [19712,19968) has 34
  bin [19968,20224) has 398
  bin [20224,20480) has 24
  bin [20480,20736) has 108
  bin [20736,20992) has 267
  bin [20992,21248) has 29
  bin [21248,21504) has 318
  bin [21504,21760) has 26
  bin [21760,22016) has 59
  bin [22016,22272) has 184
  bin [22272,22528) has 52
  bin [22528,22784) has 18
  bin [22784,23040) has 116
  bin [23040,23296) has 55
  bin [23296,23552) has 89
  bin [23552,23808) has 250
  bin [23808,24064) has 24
  bin [24064,24320) has 52
  bin [24320,24576) has 14
  bin [24576,24832) has 29
  bin [24832,25088) has 71
  bin [25088,25344) has 74
  bin [25344,25600) has 2
  bin [25600,25856) has 17
  bin [25856,26112) has 2
  bin [26368,26624) has 9
  bin [26624,26880) has 1
  bin [26880,27136) has 1
  bin [27136,27392) has 1
  bin [27392,27648) has 1
  bin [27648,27904) has 3
  bin [28416,28672) has 2
  bin [29184,29440) has 4
  bin [30720,30976) has 1
  bin [30976,31232) has 2
  bin [31232,31488) has 1
  bin [32512,32768) has 1
  bin [36864,37120) has 1
  bin [58368,58624) has 1
  bin [61184,61440) has 1
  average intensity 3625.2240208968733 for 20670652 element(s)

LASmoons: Sebastian Flachmeier

Sebastian Flachmeier (recipient of three LASmoons)
UniGIS Master of Science, University of Salzburg, AUSTRIA
Bavarian Forest National Park, administration, Grafenau, GERMANY

The Bavarian Forest National Park is located in South-Eastern Germany, along the border with the Czech Republic. It has a total area of 240 km² and its elevation ranges from 600 to 1453 m. In 2002 a project called “High-Tech-Offensive Bayern” was started and a few first/last return LiDAR transects were flown to compute some forest metrics. The results showed that LiDAR has an advantage over other methods, because the laser was able to get readings from below the canopy. New full waveform scanner were developed that produced many more returns in the lower canopy. The National Park experimented with this technology in several projects and improved their algorithms for single tree detection. In 2012 the whole park was flown with full waveform and strategies for LiDAR based forest inventory for the whole National Park were developed. This is the data that is used in the following workflow description.

The whole Bavarian Forest National Park (black line), 1000 meter tiles (black dotted lines), the coverage of the recovered flight lines (light blue). In the area marked yellow within the red frame there are gaps in some of the flightlines. The corresponding imagery in Google Earth shows that this area contains a water reservoir.

Several versions of the LiDAR existed on the server of the administration that didn’t have the attributes we needed to reconstruct the original flight lines. The number of returns per pulse, the flight line IDs, and the GPS time stamps were missing. The goal was a workflow to create a LAStools workflow to convert the LiDAR from the original ASCII text files provided by the flight company into LAS or compressed LAZ files with all fields properly populated.

 ALS data flown in 2012 by Milan Geoservice GmbH 650 m above ground with overlap.
+ full waveform sensor (RIEGL 560 / Q680i S) with up to 7 returns per shot
+ total of 11.080.835.164 returns
+ in 1102 ASCII files with *.asc extension (changed to *.txt to avoid confusion with ASC raster)
+ covered area of 1.25 kilometers
+ last return density of 17.37 returns per square meter

This data is provided by the administration of Bavarian Forest National Park. The workflow was part of a Master’s thesis to get the academic degree UniGIS Master of Science at the University of Salzburg.

LAStools processing:

The LiDAR was provided as 1102 ASCII text files named ‘spur000001.txt’ to ‘spur001102.txt’ that looked like this:

more spur000001.txt
4589319.747 5436773.357 685.837 49 106 1 215248.851500
4589320.051 5436773.751 683.155 46 24 2 215248.851500
4589320.101 5436773.772 686.183 66 87 1 215248.851503

Positions 1 to 3 store the x, y, and z coordinate in meter [m]. Position 4 stores the “echo width” in 0.1 nanoseconds [ns], position 5 stores the intensity, position 6 stores the return number, position 7 stores the GPS time stamp in seconds [s] of the current GPS week. The “number of returns (of given pulse)” information is not explicitly stored and will need to be reconstructed in order, for example, to identify which returns are last returns. The conversion from ASCII text to LAZ was done with the txt2las command line shown below that incorporates these rationals:

  • Although the ASCII files list the three coordinates with millimeter resolution (three decimal digits), we store only centimeter resolution which is sufficient to capture all the precision in a typical airborne LiDAR survey.
  • After computing histograms of the “return number” and the “echo width” for all points with lasinfo and determining their maximal ranges it was decided to use point type 1 which can store up to 7 returns per shot and store the “echo width” as an additional attribute of type 3 (“unsigned short”) using “extra bytes”.
  • The conversion from GPS time stamp in GPS week time to Adjusted Standard time was done by finding out the exact week during which Milan Geoservice GmbH carried out the survey and looking up the corresponding GPS week 1698 using this online GPS time calculator.
  • Information about the Coordinate Reference System “DHDN / 3-degree Gauss-Kruger zone 4” as reported in the meta data is added in form of EPSG code 31468 to each LAS file.
txt2las -i ascii\spur*.txt ^
        -parse xyz0irt ^
        -set_scale 0.01 0.01 0.01 ^
        -week_to_adjusted 1698 ^
        -add_attribute 3 "echo width" "of returning waveform [ns]" 0.1 0 0.1 ^
        -epsg 31468 ^
        -odir spur_raw -olaz ^
        -cores 4

The 1102 ASCII files are now 1102 LAZ files. Because we switched from GPS week time to Adjusted Standard GPS time stamps we also need to set the “global encoding” flag in the LAS header from 0 to 1 (see ASPRS LAS specification). We can do this in-place (i.e. without creating another set of files) using the following lasinfo command:

lasinfo -i spur_raw\spur*.laz ^
        -nh -nv -nc ^
        -set_global_encoding 1

To reconstruct the missing flight line information we look for gaps in the sequence of GPS time stamps by computing GPS time histograms with lasinfo and bins of 10 seconds in size:

lasinfo -i spur_raw\spur*.laz -merged ^
        -histo gps_time 10 ^
        -o spur_raw_all.txt

The resulting histogram exhibits the expected gaps in the GPS time stamps that happen when the survey plane leaves the target area and turns around to approach the next flight line. The subsequent histogram entries marked in red show gaps of 120 and 90 seconds respectively.

more spur_raw_all.txt
bin [27165909.595196404,27165919.595196255) has 3878890
bin [27165919.595196255,27165929.595196106) has 4314401
bin [27165929.595196106,27165939.595195957) has 435788
bin [27166049.595194317,27166059.595194168) has 1317998
bin [27166059.595194168,27166069.595194019) has 4432534
bin [27166069.595194019,27166079.59519387) has 4261732
bin [27166239.595191486,27166249.595191337) has 3289819
bin [27166249.595191337,27166259.595191188) has 3865892
bin [27166259.595191188,27166269.595191039) has 1989794
bin [27166349.595189847,27166359.595189698) has 2539936
bin [27166359.595189698,27166369.595189549) has 3948358
bin [27166369.595189549,27166379.5951894) has 3955071

Now that we validated their existence, we use these gaps in the GPS time stamps to split the LiDAR back into the original flightlines it was collected in. Using lassplit we produce one file per flightline as follows:

lassplit -i spur_raw\spur*.laz -merged ^
         -recover_flightlines_interval 10 ^
         -odir strips_raw -o strip.laz

In the next step we repair the missing “number of returns (per pulse)” field that was not provided in the ASCII file. This can be done with lasreturn assuming that the point records in each file are sorted by increasing GPS time stamp. This happens to be true already in our case as the original ASCII files where storing the LiDAR returns in acquisition order and we have not changed this order. If the point records are not yet in this order it can be created with lassort as follows. As these strips can have many points per file it may be necessary to run the new 64 bit executables by adding ‘-cpu64’ to the command line in order to avoid running out of memory.

lassort -i strips_raw\strips*.laz ^
        -gpstime -return_number ^
        -odir strips_sorted -olaz ^
        -cores 4 -cpu64

An order sorted by GPS time stamp is necessary as lasreturn expects point records with the same GPS time stamp (i.e. returns generated by the same laser pulse) to be back to back in the input file. To ‘-repair_number_of_returns’ the tool will load all returns with the same GPS time stamp  and update the “number of returns (per pulse)” attribute of each return to the highest “return number” of the loaded set.

lasreturn -i strips_sorted\strips*.laz ^
          -repair_number_of_returns ^
          -odir strips_repaired -olaz ^
          -cores 4

In a final step we use las2las with the ‘-files_are_flightlines’ option (or short ‘-faf’) to set the “file source ID” field in the LAS header and the “point source ID” attribute of every point record in the file to the same unique value per strip. The first file in the folder will have all its field set to 1, the next file will have all its field set to 2, the next file to 3 and so on. Please do not run this on multiple cores for the time being.

las2las -i strips_repaired\strips*.laz ^
        -files_are_flightlines ^
        -odir strips_final -olaz

It’s always useful to run a final validation of the files using lasvalidate to reassure yourself and the people you will be sharing the data with that nothing funky has happened during any of these conversion steps.

lasvalidate -i strips_final\strip*.laz ^
            -o strips_final\report.xml

And it can also be useful to add an overview in SHP or KML format to the delivery that can be created with lasboundary as follows:

lasboundary -i strips_final\strip*.laz ^
            -overview -labels ^
            -o strips_final\overview.kml

The result was 89 LAZ files (each containing one complete flightline) totaling 54 GB compared to 1102 ASCII files (each containing a slice of a flightline) totaling 574 GB.

Scrutinizing LiDAR Data from Leica’s Single Photon Scanner SPL100 (aka SPL99)

We show how simple reordering and clever remapping of single photon LiDAR data can reduce file size by a whopping 50%. We also show that there is at least one Leica’s SPL100 sensor out there that should be called SPL99 because one of its 100 beamlets (the one with beamlet ID 53) does not seem to produce any data … (-:

Closeup on the returns of two beamlet shots colored by beamlet ID from 1 (blue) to 100 (red). Beamlet ID 53 is missing.

Following up on a recent discussion in the LAStools user forum we take a closer look at some of the single photon LiDAR collected with Leica’s SPL100 sensor made available as open data by the USGS in form of LASzip-compressed tiles in LAS 1.4 format of point type 6. This investigation was sparked by the curiosity of what value was stored to the “scanner channel” field that was added to the new point types 6 to 10 in the LAS 1.4 specification.

lasview -i USGS_LPC_SD_MORiver_Woolpert_B1_2016_14TNP120310_LAS_2018.laz ^
        -copy_scanner_channel_into_point_source ^

Visualizing this 2 bit number whose value can range from 0 to 3 for the first tile we downloaded resulted in this non-conclusive “magic eye” visualization. What do you see? A sailboat?

Visualizing the “scanner channel” field by mapping its four different values to different colors.

Jason Stoker from the USGS suggested that this is the truncated “beamlet” ID. Leica’s SPL100 sensor uses 100 beamlets rather than one or two laser beams to collect data. Storing the beamlet IDs between 1 and 100 to this 2 bit field that can only hold numbers between 0 and 3 is kind of pointless and should be avoided. LASzip switches prediction contexts based on this field resulting in slower compression speed and lower compression rates. The beamlet ID is also stored in the 8 bit “user data” field, so that we can simply zero the “scanner channel” field. To investigate this further we downloaded these nine tiles from this FTP site of the USGS:

Whenever we download LAZ files we first run laszip with the ‘-check’ option which performs a sanity check to make sure that the files are not truncated or otherwise corrupted. In our case we get nine solid reports of SUCCESS.

laszip -i USGS_LPC_SD_MORiver_Woolpert_B1_*_2018.laz -check

A visual inspection with lasview tells us that there are a number of flightlines in the data.

lasview -i USGS_LPC_SD_MORiver_Woolpert_B1_*_2018.laz ^
        -points 15000000 ^

We use las2las to extract flightline 2003 and lasinfo to produce a histogram of GPS times which we use in turn to decide on which quarter second of GPS time worth of data we want to extract again with las2las.

las2las -i USGS_LPC_SD_MORiver_Woolpert_B1_2016_*_LAS_2018.laz ^
        -merged ^
        -keep_point_source 2003 ^
        -o USGS_LPC_SD_MORiver_Woolpert_B1_ps_2002.laz

lasinfo -i USGS_LPC_SD_MORiver_Woolpert_B1_ps_2002.laz ^
        -cd ^
        -histo gps_time 1 ^
        -odix _info -otxt

las2las -i USGS_LPC_SD_MORiver_Woolpert_B1_ps_2002.laz ^
        -keep_gps_time 176475495 176475495.25 ^
        -o USGS_LPC_SD_MORiver_Woolpert_B1_gps176475495_quarter.laz

It always helps to give your LAZ files meaningful names in case you find them again two years later or so. We can nicely see the circular scanning pattern Leica’s SPL100 sensor. With lasview we measure that this single flightline has an extent of about 2000 meters on the ground. The lasinfo report shows a pulse density of around 19 last returns per square meter. We then sort the points by GPS time using lassort. This groups together all the returns that are the result of one “shot” of the laser with 100 beamlets as we can nicely see in the las2txt output below. Each group of returns has slightly below 100 points and there is one group every 0.00002 seconds. This means the SPL100 is firing once every 20 microseconds.

lassort -i USGS_LPC_SD_MORiver_Woolpert_B1_gps176475495_quarter.laz ^
        -gps_time ^
        -odix _sorted -olaz

las2txt -i USGS_LPC_SD_MORiver_Woolpert_B1_gps176475495_quarter_sorted.laz ^
        -parse tuxyz ^
        -stdout | more
176475495.000008 4 514408.78 4830989.78 487.79
176475495.000008 9 514410.38 4830987.49 487.70
176475495.000008 47 514411.49 4830987.71 487.70
        [ ... 86 lines deleted ... ]
176475495.000008 39 514408.53 4830991.81 487.80
176475495.000008 50 514407.97 4830991.69 487.80
176475495.000008 16 514409.24 4830991.46 487.85
176475495.000028 55 514413.51 4830985.79 487.61
176475495.000028 97 514411.10 4830990.03 487.74
176475495.000028 72 514411.30 4830989.53 487.74
        [ ... 82 lines deleted ... ]
176475495.000028 45 514410.30 4830986.19 487.70
176475495.000028 3 514409.15 4830987.52 487.73
176475495.000028 96 514411.81 4830985.46 487.67
176475495.000048 66 514411.35 4830985.15 487.67
176475495.000048 83 514411.59 4830984.65 487.61
176475495.000048 64 514413.09 4830983.93 487.61
        [ ... 78 lines deleted ... ]
176475495.000048 4 514407.30 4830984.82 487.70
176475495.000048 34 514408.65 4830983.01 487.70
176475495.000048 21 514408.11 4830982.90 487.70
176475495.000068 13 514408.25 4830981.13 487.66
176475495.000068 92 514410.53 4830984.23 487.68
176475495.000068 44 514407.17 4830980.88 487.67
        [ ... 80 lines deleted ... ]
176475495.000068 76 514408.67 4830984.37 487.71
176475495.000068 47 514409.23 4830980.27 487.67
176475495.000068 87 514412.11 4830981.93 487.61
176475495.000088 97 514408.80 4830982.62 487.70
176475495.000088 33 514407.24 4830980.68 487.64
176475495.000088 30 514407.36 4830981.77 487.68
[ ... ]

Now we can “play back” the returns in acquisition order. We map returns from one group to the same color in lasview with the new ‘-bin_gps_time_into_point_source 0.00002’ option (that will be available in the next LAStools release). For a slower playback we add ‘-steps 5000’. Press the ‘c’ key to switch through the coloring options. Press the ‘s’ key repeatedly to incrementally show the points. To take a step back press <SHIFT>+’s’.

lasview -i USGS_LPC_SD_MORiver_Woolpert_B1_gps176475495_quarter_sorted.laz ^
        -bin_gps_time_into_point_source 0.00002 ^
        -scale_user_data 2.5 ^
        -steps 5000 ^
        -win 1024 384

This slideshow requires JavaScript.

The last image colors the points by the values in the user data field (multiplied by 2.5), which essentially maps the beamlet IDs between 1 and 100 to a rainbow color ramp from blue to red. This tells us how the numbering of the beamlets from 1 to 100 corresponds to their layout in space. The next sequence of images takes a closer look at that.

This slideshow requires JavaScript.

From a compression point of view it makes sense to (1) zero the meaningless scanner channel, (2) order the points by GPS time stamps to groups beamlet returns together, and (3) order the points with the same time stamp by the user data field. The compression gain is enormous with the 9 tiles going from over 3 GB to under 2 GB:

337,156,981 USGS_LPC_SD_MORiver_Woolpert_B1_2016_14TNP120300_LAS_2018.laz
331,801,150 USGS_LPC_SD_MORiver_Woolpert_B1_2016_14TNP120310_LAS_2018.laz
358,928,274 USGS_LPC_SD_MORiver_Woolpert_B1_2016_14TNP120320_LAS_2018.laz
328,597,628 USGS_LPC_SD_MORiver_Woolpert_B1_2016_14TNP130300_LAS_2018.laz
355,997,013 USGS_LPC_SD_MORiver_Woolpert_B1_2016_14TNP130310_LAS_2018.laz
360,403,079 USGS_LPC_SD_MORiver_Woolpert_B1_2016_14TNP130320_LAS_2018.laz
355,399,781 USGS_LPC_SD_MORiver_Woolpert_B1_2016_14TNP140300_LAS_2018.laz
354,523,659 USGS_LPC_SD_MORiver_Woolpert_B1_2016_14TNP140310_LAS_2018.laz
357,248,968 USGS_LPC_SD_MORiver_Woolpert_B1_2016_14TNP140320_LAS_2018.laz
  3,140,056,533 Bytes

197,641,087 USGS_LPC_SD_MORiver_Woolpert_B1_2016_14TNP120300_LAS_2018_sorted.laz
194,750,096 USGS_LPC_SD_MORiver_Woolpert_B1_2016_14TNP120310_LAS_2018_sorted.laz
210,013,408 USGS_LPC_SD_MORiver_Woolpert_B1_2016_14TNP120320_LAS_2018_sorted.laz
190,687,275 USGS_LPC_SD_MORiver_Woolpert_B1_2016_14TNP130300_LAS_2018_sorted.laz
206,447,730 USGS_LPC_SD_MORiver_Woolpert_B1_2016_14TNP130310_LAS_2018_sorted.laz
209,580,551 USGS_LPC_SD_MORiver_Woolpert_B1_2016_14TNP130320_LAS_2018_sorted.laz
205,827,197 USGS_LPC_SD_MORiver_Woolpert_B1_2016_14TNP140300_LAS_2018_sorted.laz
203,808,113 USGS_LPC_SD_MORiver_Woolpert_B1_2016_14TNP140310_LAS_2018_sorted.laz
206,789,959 USGS_LPC_SD_MORiver_Woolpert_B1_2016_14TNP140320_LAS_2018_sorted.laz
  1,825,545,416 Bytes

Enumerating the 100 beamlets with a geometrically more coherent order would improve compression even more. Can anyone convince Leica to do this? The simple mapping of beamlet IDs shown below that arranges the beamlets into a zigzag order another huge compression gain of 15 percent. Altogether reordering and remapping lower the compressed file size by a whopping 50 percent.

Beamlet ID mapping table to improve spatial coherence.

168,876,666 USGS_LPC_SD_MORiver_Woolpert_B1_2016_14TNP120300_LAS_2018_mapped_sorted.laz
165,241,508 USGS_LPC_SD_MORiver_Woolpert_B1_2016_14TNP120310_LAS_2018_mapped_sorted.laz
176,524,959 USGS_LPC_SD_MORiver_Woolpert_B1_2016_14TNP120320_LAS_2018_mapped_sorted.laz
163,679,216 USGS_LPC_SD_MORiver_Woolpert_B1_2016_14TNP130300_LAS_2018_mapped_sorted.laz
176,086,559 USGS_LPC_SD_MORiver_Woolpert_B1_2016_14TNP130310_LAS_2018_mapped_sorted.laz
178,909,108 USGS_LPC_SD_MORiver_Woolpert_B1_2016_14TNP130320_LAS_2018_mapped_sorted.laz
174,735,634 USGS_LPC_SD_MORiver_Woolpert_B1_2016_14TNP140300_LAS_2018_mapped_sorted.laz
171,679,105 USGS_LPC_SD_MORiver_Woolpert_B1_2016_14TNP140310_LAS_2018_mapped_sorted.laz
174,997,090 USGS_LPC_SD_MORiver_Woolpert_B1_2016_14TNP140320_LAS_2018_mapped_sorted.laz
  1,550,729,845 Bytes

Once this is done a final space-filling sort into a Hilbert-curve or a Morton-order with lassort or lasoptimize would improve spatial coherence for efficient spatial indexing with lasindex.

Oh yes … the SPL100 was not firing on all cylinders. The beamlet ID 53 that would have mapped to 61 in our table was not present in any of the 9 tiles with 355,047,478 points that we had downloaded as the lasinfo histogram below shows.

lasinfo -i USGS_LPC_SD_MORiver_Woolpert_B1_2016_*_2018.laz -merged -histo user_data 1
lasinfo (180911) report for 9 merged files
reporting all LAS header entries:
  file signature:             'LASF'
  file source ID:             0
  global_encoding:            17
  project ID GUID data 1-4:   194774FA-35FE-4591-D484-010AFA13F6D9
  version major.minor:        1.4
  system identifier:          'Woolpert LAS'
  generating software:        'GeoCue LAS Updater'
  file creation day/year:     332/2017
  header size:                375
  offset to point data:       1376
  number var. length records: 1
  point data format:          6
  point data record length:   30
  number of point records:    0
  number of points by return: 0 0 0 0 0
  scale factor x y z:         0.01 0.01 0.01
  offset x y z:               0 0 0
  min x y z:                  512000.00 4830000.00 286.43
  max x y z:                  514999.99 4832999.99 866.81
  start of waveform data packet record: 0
  start of first extended variable length record: 0
  number of extended_variable length records: 0
  extended number of point records: 355047478
  extended number of points by return: 298476060 52480771 3929583 157365 3699 0 0 0 0 0 0 0 0 0 0
variable length header record 1 of 1:
  reserved             0
  user ID              'LASF_Projection'
  record ID            2112
  length after header  943
  description          'OGC WKT Coordinate System'
    COMPD_CS["NAD83(2011) / UTM zone 14N + NAVD88 height - Geoid12B (metre)",PROJCS["NAD83(2011) / UTM zone 14N",GEOGCS["NAD83(2011)",DATUM["NAD83_National_Spat
ial_Reference_System_2011",SPHEROID["GRS 1980",6378137,298.257222101,AUTHORITY["EPSG","7019"]],AUTHORITY["EPSG","1116"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","
SG","9001"]],AXIS["Easting",EAST],AXIS["Northing",NORTH],AUTHORITY["EPSG","6343"]],VERT_CS["NAVD88 height - Geoid12B (metre)",VERT_DATUM["North American Vertica
l Datum 1988",2005,AUTHORITY["EPSG","5103"]],UNIT["metre",1,AUTHORITY["EPSG","9001"]],AXIS["Gravity-related height",UP],AUTHORITY["EPSG","5703"]]]
the header is followed by 4 user-defined bytes
LASzip compression (version 3.1r0 c3 50000): POINT14 3
reporting minimum and maximum for all LAS point record entries ...
  X            51200000   51499999
  Y           483000000  483299999
  Z               28643      86681
  intensity        3139      12341
  return_number       1          5
  number_of_returns   1          5
  edge_of_flight_line 0          0
  scan_direction_flag 0          1
  classification      1         10
  scan_angle_rank  -127        127
  user_data           1        100
  point_source_ID  1061       2005
  gps_time 176475467.194000 176496233.636563
  extended_return_number          1      5
  extended_number_of_returns      1      5
  extended_classification         1     10
  extended_scan_angle        -21167  21167
  extended_scanner_channel        0      3
number of first returns:        298476060
number of intermediate returns: 6282
number of last returns:         355000765
number of single returns:       298435629
overview over extended number of returns of given pulse: 298435629 52515017 3935373 157750 3709 0 0 0 0 0 0 0 0 0 0
histogram of classification of points:
       138382030  unclassified (1)
       207116732  ground (2)
         9233160  noise (7)
          310324  water (9)
            5232  rail (10)
 +-> flagged as withheld:  9233160
 +-> flagged as extended overlap: 226520346
user data histogram with bin size 1.000000
  bin 1 has 3448849
  bin 2 has 3468566
  bin 3 has 3721848
  bin 4 has 3376990
  bin 5 has 3757996
  bin 6 has 3479546
  bin 7 has 3799930
  bin 8 has 3766887
  bin 9 has 3448383
  bin 10 has 3966036
  bin 11 has 3232086
  bin 12 has 3686789
  bin 13 has 3763869
  bin 14 has 3847765
  bin 15 has 3659059
  bin 16 has 3666918
  bin 17 has 3427468
  bin 18 has 3375320
  bin 19 has 3222116
  bin 20 has 3598643
  bin 21 has 3108323
  bin 22 has 3553625
  bin 23 has 3782185
  bin 24 has 3577792
  bin 25 has 3063871
  bin 26 has 3451800
  bin 27 has 3518763
  bin 28 has 3845852
  bin 29 has 3366980
  bin 30 has 3797986
  bin 31 has 3623477
  bin 32 has 3606798
  bin 33 has 3762737
  bin 34 has 3861023
  bin 35 has 3821228
  bin 36 has 3738173
  bin 37 has 3902190
  bin 38 has 3726752
  bin 39 has 3910989
  bin 40 has 3771132
  bin 41 has 3718437
  bin 42 has 3609113
  bin 43 has 3339941
  bin 44 has 3003191
  bin 45 has 3697140
  bin 46 has 2329171
  bin 47 has 3398836
  bin 48 has 3511882
  bin 49 has 3719592
  bin 50 has 2995275
  bin 51 has 3673925
  bin 52 has 3535992
  bin 54 has 3799430
  bin 55 has 3613345
  bin 56 has 3761436
  bin 57 has 3296831
  bin 58 has 3810146
  bin 59 has 3768464
  bin 60 has 3520871
  bin 61 has 3833149
  bin 62 has 3639778
  bin 63 has 3623008
  bin 64 has 3581480
  bin 65 has 3663180
  bin 66 has 3661434
  bin 67 has 3684374
  bin 68 has 3723125
  bin 69 has 3552397
  bin 70 has 3554207
  bin 71 has 3535494
  bin 72 has 3621334
  bin 73 has 3633928
  bin 74 has 3631845
  bin 75 has 3526502
  bin 76 has 3605631
  bin 77 has 3452006
  bin 78 has 3796382
  bin 79 has 3731841
  bin 80 has 3683314
  bin 81 has 3806024
  bin 82 has 3749709
  bin 83 has 3808218
  bin 84 has 3634032
  bin 85 has 3631015
  bin 86 has 3712206
  bin 87 has 3627775
  bin 88 has 3674966
  bin 89 has 3231151
  bin 90 has 3780037
  bin 91 has 3621958
  bin 92 has 3623264
  bin 93 has 3853536
  bin 94 has 3623380
  bin 95 has 3418309
  bin 96 has 3374827
  bin 97 has 3464734
  bin 98 has 3562560
  bin 99 has 3078686
  bin 100 has 3426924


Estonia leads in Open LiDAR: nationwide & multi-temporal Point Clouds now Online

At the beginning of July 2018 the Baltic country of Estonia – with an area of 45 thousand square kilometers inhabited by around 1.3 million people – opened much of their geospatial data archives and is now offering easy and free download of LiDAR point clouds nationwide via a portal of the Estonian Land Board. What is even more exciting is that multi-temporal data sets flown in different years and seasons are available. Raw LiDAR point clouds collected either during a “regular flight in spring” or during a “forestry flight in summer” can be obtained for multiple years. The 1 km by 1 km tile with map sheet index 377650, for example, is available for four different LiDAR surveys carried out in spring 2011, summer 2013, spring 2015 and summer 2017. This offers incredible potential for studying temporal changes of man-made or natural environments. The screenshot sequence below shows how to navigate to the download site starting from this page.

We found out about this open data release during our hands-on workshop on LiDAR and photogrammetry point cloud processing with LAStools that was part of the UAV remote sensing summer school in Tartu, Estonia. See our calendar for upcoming events or contact us for holding a similar event at your university, agency, company, or conference. 

The LiDAR data provided on the download portal is compressed with LASzip and provided as 1km by 1km LiDAR tiles in LAZ format. You can search for these tiles via their Estonian 1:2000 map sheet indices. To find out which map sheet index corresponds to the tile you are interested in you can overlay the maps sheet indices over an online map. However, you will need to zoom in before you can see the indices as illustrated in the screenshot sequence below. Here a zoom to the map sheet indices for the area that we visited during the social event of the summer school.

One thing we noticed is that the tiles contain only a single layer of points. The overlaps between flightlines were removed which results in a more uniform point density but strips the user of the possibility to do their own flightline alignment checks with lasoverlap. Below you see the spring 2014 acquisition for the tile with map sheet index 475861 colored by classification, elevation, return type, flightline ID and intensity.

The license for the open data of the Estonian Land Board is very permissive and can be found here. Agreeing to the licence gives the licence holder the rights to use data free of charge for an unspecified term, to good purpose in accordance with law and best practice. Licence holder may produce derivatives of data, combine data with its own products or services, use data for commercial or non-commercial purposes and redistribute data. The licence holder obliges to refer to the origin of data when publishing and redistributing data. The reference must include the name of the licensor, the title of data, the age of data (or the date of data extraction).

Scotland’s LiDAR goes Open Data (too)

Following the lead of England and Wales, the Scottish LiDAR is now also open data. The implementation of such an open geospatial policy in the United Kingdom was spear-headed by the Environment Agency of England who started to make all of their LiDAR holdings available as open data. In September 2015 they opened DTM and DSM raster derivatives down to 25 cm resolution and in March 2016 also the raw point clouds went online our compressed and open LAZ format (more info here) – all with the very permissible Open Government Licence v3. This treasure cove of geospatial data was collected by the Environment Agency Geomatics own survey aircraft mainly for flood mapping purposes. The data that had been access restricted for the past 17 years of operation and was made open only after it was shown that restricting access in order to recover costs to finance future operations – a common argument for withholding tax-payer funded data – was nothing but an utter myth. This open data policy has resulted in an incredible re-use of the LiDAR and the Environment Agency has literally been propelled into the role of a “champion for open data” inspiring Wales (possibly the German states of North-Rhine Westfalia and Thuringia) and now also Scotland to open up their geospatial archives as well …

Huge LAS files available for download from the Scottish Open Data portal.

We went to the nice online portal of Scotland to download three files from the Phase II LiDAR for Scotland that are provided as uncompressed LAS files, namely LAS_NN45NE.las, LAS_NN55NE.las, and LAS_NN55NW.las, whose sizes are listed as 1.2 GB, 2.8 GB, and 4.7 GB in the screenshot above. Needless to say that it took quite some time and several restarts (using wget with option ‘-c’) to successfully download these very large LAS files.

laszip -i LAS_NN45NE.las -odix _cm -olaz -rescale 0.01 0.01 0.01 
laszip -i LAS_NN45NE.las -odix _mm -olaz
laszip -i LAS_NN55NE.las -odix _cm -olaz -rescale 0.01 0.01 0.01 
laszip -i LAS_NN55NE.las -odix _mm -olaz
laszip -i LAS_NN55NW.las -odix _cm -olaz -rescale 0.01 0.01 0.01 
laszip -i LAS_NN55NW.las -odix _mm -olaz

After downloading we decided to see how well these files compress with LASzip by running the six commands shown above creating LAZ files when re-scaling of coordinate resolution to centimeter (cm) and LAZ files with the original millimeter (mm) coordinate resolution (i.e. the original scale factors are 0.001 which is somewhat excessive for aerial LiDAR where the error in position per coordinate is typically between 5 cm and 20 cm). Below you see the resulting file sizes for the three different files.

 1,164,141,247 LAS_NN45NE.las
   124,351,690 LAS_NN45NE_cm.laz (1 : 9.4)
   146,651,719 LAS_NN45NE_mm.laz (1 : 7.9)
 2,833,123,863 LAS_NN55NE.las
   396,521,115 LAS_NN55NE_cm.laz (1 : 7.1)
   474,767,495 LAS_NN55NE_mm.laz (1 : 6.0)
 4,664,782,671 LAS_NN55NW.las
   531,454,473 LAS_NN55NW_cm.laz (1 : 8.8)
   629,141,151 LAS_NN55NW_mm.laz (1 : 7.4)

The savings in download time and storage space of storing the LiDAR in LAZ versus LAS are sixfold to tenfold. If I was a tax payer in Scotland and if my government was hosting open data on in the Amazon cloud (i.e. paying for AWS cloud services with my taxes) I would encourage them to store their data in a more compressed format. Some more details on the data.

According to the provided meta data, the Scottish Public Sector LiDAR Phase II dataset was commissioned by the Scottish Government in response to the Flood Risk Management Act (2009). The project was managed by Sniffer and the contract was awarded to Fugro BKS. Airborne LiDAR data was collected for 66 sites (the dataset does not have full national coverage) totaling 3,516 km^2 between 29th November 2012 and 18th April 2014. The point density was a minimum of 1 point/sqm, and approximately 2 points/sqm on average. A DTM and DSM were produced from the point clouds, with 1m spatial resolution. The Coordinate reference system is OSGB 1936 / British National Grid (EPSG code 27700). The data is licensed under an Open Government Licence. However, under the use constraints section it now only states that the following attribution statement must be used to acknowledge the source of the information: “Copyright Scottish Government and SEPA (2014)” but also that Fugro retain the commercial copyright, which is somewhat disconcerting and may require more clarification. According to this tweet a lesser license (NCGL) applies to the raw LiDAR point clouds. Below a lasinfo report for the large LAS_NN55NW.las as well as several visualizations with lasview.

lasinfo (170915) report for LAS_NN55NW.las
reporting all LAS header entries:
 file signature: 'LASF'
 file source ID: 0
 global_encoding: 1
 project ID GUID data 1-4: 00000000-0000-0000-0000-000000000000
 version major.minor: 1.2
 system identifier: 'Riegl LMS-Q'
 generating software: 'Fugro LAS Processor'
 file creation day/year: 343/2016
 header size: 227
 offset to point data: 227
 number var. length records: 0
 point data format: 1
 point data record length: 28
 number of point records: 166599373
 number of points by return: 149685204 14102522 2531075 280572 0
 scale factor x y z: 0.001 0.001 0.001
 offset x y z: 250050 755050 270
 min x y z: 250000.000 755000.000 203.731
 max x y z: 254999.999 759999.999 491.901
reporting minimum and maximum for all LAS point record entries ...
 X -50000 4949999
 Y -50000 4949999
 Z -66269 221901
 intensity 39 2046
 return_number 1 4
 number_of_returns 1 4
 edge_of_flight_line 0 1
 scan_direction_flag 1 1
 classification 1 11
 scan_angle_rank -30 30
 user_data 0 3
 point_source_ID 66 91
 gps_time 38230669.389034 38402435.753789
number of first returns: 149685204
number of intermediate returns: 2813604
number of last returns: 149687616
number of single returns: 135599244
overview over number of returns of given pulse: 135599244 23122229 6754118 1123782 0 0 0
histogram of classification of points:
 287819 unclassified (1)
 109019874 ground (2)
 14476880 low vegetation (3)
 3487218 medium vegetation (4)
 39141518 high vegetation (5)
 165340 building (6)
 13508 rail (10)
 7216 road surface (11)

Kudos to the Scottish government for opening their data. We hereby acknowledge the source of the LiDAR that we have used in the experiments above as “Copyright Scottish Government and SEPA (2014)”.