CyArk partners with Google, takes over “Don’t be Evil” Mantra, opens LiDAR Archive

One of our most popular (and controversial) blog articles was “Can You Copyright LiDAR“. It was written after we saw the then chief executive director at CyArk commenting “Sweeeet use of CyArk data” on an article describing the creation of a sugary fudge replica of Guatemala’s Tikal temple promoting a series of sugars by multinational agribusiness Tate & Lyle. Yet just a few months earlier our CEO’s university was instructed to take down his Web pages that – using the same data set – were demonstrating how to realize efficient 3D content delivery across the Web. CyArk told university administrators in an email that he was “[…] hosting unauthorized content from CyArk […]”. The full story is here.

Back then, the digital preservation strategy of CyArk was to keep their archaeological scans safe through their partnership with Iron Mountain. In the comment section of “Can You Copyright LiDAR” you can find several entries that are critical of this approach. But that was five years ago. Earlier this year and just after Google removed the “Don’t be Evil” mantra from their code of conduct, CyArk stepped up to take it over and completely changed their tight data control policies. Through their “Open Heritage initiative” CyArk released for the first time their raw LiDAR and imagery with an open license. Here in their own words:

In 2018, CyArk launched the Open Heritage initiative, a
collaboration with Google Arts and Culture to make available
our archive to a broader audience. This was the first time
CyArk has made available primary data sets, including lidar
scans, photogrammetric imagery and corresponding metadata
in a standardized format on a self-serve platform. We are
committed to opening up our archive further as we collect
new data and publishing existing projects where permissions
allow. The data is made available for education, research
and other non-commercial uses via a Creative Commons
Attribution-NonCommercial 4.0 International License.

This is a HUGE change from the situation in 2013 that resulted in the deletion of our CEO’s Web pages. So we went to download Guatemala’s Tikal temple – the one that got him into trouble back then. It is provided as a single E57 file called ‘Tikal.e57’ with a size of 1074 MB that contains 35,551,759 points in 118 individual scan positions. Using the e572las.exe tool that is part of LAStools we converted this into a single LAZ file ‘Tikal.laz’ with a size of 164 MB.

C:\LAStools\bin>e572las -i c:\data\Tikal\Tikal.e57 ^
                        -o c:\data\Tikal\Tikal.laz

We were not able to find information about the Coordinate Reference System (CRS), but after looking at the coordinate bounding box (see lasinfo report at the end of the article) and the set of projections covering Guatemala, one can make an educated guess that it might be UTM 16 north. Generating a false-colored highest-return 0.5 meter raster with lasgrid and loading it into Google Earth quickly confirms that this is correct.

lasgrid -i c:\data\Tikal\Tikal.laz ^
        -step 0.5 ^
        -highest ^
        -false ^
        -utm 16north ^
        -odix _elev -opng
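The educated guess can also be sanity-checked with a few lines of Python. This is only a sketch: the longitude of roughly 89.6 degrees west for Tikal is our own assumption (it is not part of the CyArk data set) and the function name is made up for illustration.

```python
import math

def utm_zone_from_longitude(lon_deg):
    """Return the standard 6-degree UTM zone number (1..60) for a longitude in degrees."""
    return int(math.floor((lon_deg + 180.0) / 6.0)) % 60 + 1

# Assumed approximate longitude of Tikal (not taken from the data set).
TIKAL_LON = -89.6

zone = utm_zone_from_longitude(TIKAL_LON)
print("UTM zone:", zone)

# Easting range of the bounding box as listed in the lasinfo report.
min_x, max_x = 220854.951, 221115.921

# UTM eastings carry a false easting of 500 km, so values inside a zone
# stay roughly between 160 km and 840 km.
plausible = 160000.0 < min_x < max_x < 840000.0
print("eastings plausible for a UTM zone:", plausible)
```

A check like this only narrows down the candidates. The overlay in Google Earth remains the actual confirmation.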

Now we can laspublish the file with the command line below to create an interactive 3D Web portal using Potree. Unlike five years ago we should now be permitted to create an online portal without the headaches of last time. The CC BY-NC 4.0 license allows us to copy and redistribute the material in any medium or format for non-commercial purposes, provided we give appropriate credit.

laspublish -i c:\data\Tikal\Tikal.laz ^
           -rgb ^
           -utm 16north ^
           -o tikal.html ^
           -title "CyArk's LiDAR Scan of Tikal" ^
           -description "35,551,759 points from 118 individual scans (licensed CC BY-NC 4.0)" ^
           -odir C:\data\Tikal\Tikal -olaz ^
           -overwrite

Below are two screenshots of the online portal that we have just created including some quick distance measurements. This is amazing data. Wow!

Looking at “Templo del Gran Jaguar” from “La Gran Plaza” after taking two measurements.

Overlooking “La Gran Plaza” out of the upper opening of “Templo del Gran Jaguar” with “Templo de las Máscaras” in the back.

We congratulate CyArk on their new Open Heritage initiative and thank them for providing easy access to the Tikal temple LiDAR scans as open data with a useful Creative Commons Attribution-NonCommercial 4.0 International license. Thank you, CyArk, for your contribution to open data and open science. Kudos!

C:\LAStools\bin>lasinfo -i c:\data\Tikal\Tikal.laz
lasinfo (181119) report for 'c:\data\Tikal\Tikal.laz'
reporting all LAS header entries:
  file signature:             'LASF'
  file source ID:             0
  global_encoding:            0
  project ID GUID data 1-4:   00000000-0000-0000-0000-000000000000
  version major.minor:        1.2
  system identifier:          'LAStools (c) by Martin Isenburg'
  generating software:        'e572las.exe (version 180919)'
  file creation day/year:     0/0
  header size:                227
  offset to point data:       227
  number var. length records: 0
  point data format:          2
  point data record length:   26
  number of point records:    35551759
  number of points by return: 35551759 0 0 0 0
  scale factor x y z:         0.001 0.001 0.001
  offset x y z:               220000 1900000 0
  min x y z:                  220854.951 1905881.781 291.967
  max x y z:                  221115.921 1906154.829 341.540
LASzip compression (version 3.2r4 c2 50000): POINT10 2 RGB12 2
reporting minimum and maximum for all LAS point record entries ...
  X              854951    1115921
  Y             5881781    6154829
  Z              291967     341540
  intensity       24832      44800
  return_number       1          1
  number_of_returns   1          1
  edge_of_flight_line 0          0
  scan_direction_flag 0          0
  classification      0          0
  scan_angle_rank     0          0
  user_data           0          0
  point_source_ID     1        118
  Color R 0 65280
        G 0 65280
        B 0 65280
number of first returns:        35551759
number of intermediate returns: 0
number of last returns:         35551759
number of single returns:       35551759
overview over number of returns of given pulse: 35551759 0 0 0 0 0 0
histogram of classification of points:
        35551759  never classified (0)
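For readers who wonder how the raw integer ranges in this report relate to the bounding box in the header, here is a small Python sketch. It uses only numbers from the report above; the function name is ours. The remark about the colors is our interpretation, not something stated in the CyArk metadata.

```python
def decode(raw, scale, offset):
    """Convert a raw integer LAS coordinate into a real-world coordinate."""
    return raw * scale + offset

# Scale factor and offsets copied from the lasinfo report.
scale = 0.001
offset_x, offset_y, offset_z = 220000.0, 1900000.0, 0.0

# Minimum raw X and maximum raw Y copied from the lasinfo report.
min_x = decode(854951, scale, offset_x)
max_y = decode(6154829, scale, offset_y)
print(f"min x = {min_x:.3f}, max y = {max_y:.3f}")

# The color maximum of 65280 equals 255 * 256, which suggests (our guess)
# that 8-bit colors were shifted into the 16-bit RGB fields of the LAS file.
print("65280 == 255 * 256:", 65280 == 255 * 256)
```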

LASmoons: Sebastian Flachmeier

Sebastian Flachmeier (recipient of three LASmoons)
UniGIS Master of Science, University of Salzburg, AUSTRIA
Bavarian Forest National Park, administration, Grafenau, GERMANY

Background:
The Bavarian Forest National Park is located in South-Eastern Germany, along the border with the Czech Republic. It has a total area of 240 km² and its elevation ranges from 600 to 1453 m. In 2002 a project called “High-Tech-Offensive Bayern” was started and a few first/last return LiDAR transects were flown to compute some forest metrics. The results showed that LiDAR has an advantage over other methods, because the laser was able to get readings from below the canopy. New full waveform scanners were developed that produced many more returns in the lower canopy. The National Park experimented with this technology in several projects and improved their algorithms for single tree detection. In 2012 the whole park was flown with full waveform and strategies for LiDAR based forest inventory for the whole National Park were developed. This is the data that is used in the following workflow description.

The whole Bavarian Forest National Park (black line), 1000 meter tiles (black dotted lines), the coverage of the recovered flight lines (light blue). In the area marked yellow within the red frame there are gaps in some of the flightlines. The corresponding imagery in Google Earth shows that this area contains a water reservoir.

Goal:
Several versions of the LiDAR existed on the server of the administration that didn’t have the attributes we needed to reconstruct the original flight lines. The number of returns per pulse, the flight line IDs, and the GPS time stamps were missing. The goal was to create a LAStools workflow to convert the LiDAR from the original ASCII text files provided by the flight company into LAS or compressed LAZ files with all fields properly populated.

Data:
+ ALS data flown in 2012 by Milan Geoservice GmbH 650 m above ground with overlap.
+ full waveform sensor (RIEGL 560 / Q680i S) with up to 7 returns per shot
+ total of 11.080.835.164 returns
+ in 1102 ASCII files with *.asc extension (changed to *.txt to avoid confusion with ASC raster)
+ covered area of 1.25 kilometers
+ last return density of 17.37 returns per square meter

This data is provided by the administration of Bavarian Forest National Park. The workflow was part of a Master’s thesis to get the academic degree UniGIS Master of Science at the University of Salzburg.

LAStools processing:

The LiDAR was provided as 1102 ASCII text files named ‘spur000001.txt’ to ‘spur001102.txt’ that looked like this:

more spur000001.txt
4589319.747 5436773.357 685.837 49 106 1 215248.851500
4589320.051 5436773.751 683.155 46 24 2 215248.851500
4589320.101 5436773.772 686.183 66 87 1 215248.851503
[…]

Positions 1 to 3 store the x, y, and z coordinates in meters [m]. Position 4 stores the “echo width” in 0.1 nanoseconds [ns], position 5 stores the intensity, position 6 stores the return number, and position 7 stores the GPS time stamp in seconds [s] of the current GPS week. The “number of returns (of given pulse)” information is not explicitly stored and will need to be reconstructed in order, for example, to identify which returns are last returns. The conversion from ASCII text to LAZ was done with the txt2las command line shown below that incorporates these rationales:

  • Although the ASCII files list the three coordinates with millimeter resolution (three decimal digits), we store only centimeter resolution which is sufficient to capture all the precision in a typical airborne LiDAR survey.
  • After computing histograms of the “return number” and the “echo width” for all points with lasinfo and determining their maximal ranges it was decided to use point type 1 which can store up to 7 returns per shot and store the “echo width” as an additional attribute of type 3 (“unsigned short”) using “extra bytes”.
  • The conversion from GPS time stamps in GPS week time to Adjusted Standard GPS time was done by finding out the exact week during which Milan Geoservice GmbH carried out the survey and looking up the corresponding GPS week 1698 using this online GPS time calculator.
  • Information about the Coordinate Reference System “DHDN / 3-degree Gauss-Kruger zone 4” as reported in the metadata is added in form of EPSG code 31468 to each LAS file.

txt2las -i ascii\spur*.txt ^
        -parse xyz0irt ^
        -set_scale 0.01 0.01 0.01 ^
        -week_to_adjusted 1698 ^
        -add_attribute 3 "echo width" "of returning waveform [ns]" 0.1 0 0.1 ^
        -epsg 31468 ^
        -odir spur_raw -olaz ^
        -cores 4
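The arithmetic behind the week-to-adjusted conversion is simple enough to sketch in Python. Adjusted Standard GPS time is the number of seconds since the GPS epoch (6 January 1980) minus one billion. The function names below are ours, and the example date of 25 July 2012 is just a hypothetical day inside GPS week 1698 because the actual survey dates are not listed here.

```python
from datetime import date

SECONDS_PER_WEEK = 7 * 24 * 60 * 60   # 604800 seconds
GPS_EPOCH = date(1980, 1, 6)          # first day of GPS week 0
ADJUSTED_OFFSET = 1_000_000_000       # subtracted for Adjusted Standard GPS time

def gps_week_of(day):
    """Return the GPS week number that contains the given calendar day."""
    return (day - GPS_EPOCH).days // 7

def week_to_adjusted(week, seconds_of_week):
    """Convert a GPS week time stamp into Adjusted Standard GPS time."""
    return week * SECONDS_PER_WEEK + seconds_of_week - ADJUSTED_OFFSET

# Hypothetical survey day, only used to illustrate the week lookup.
print("GPS week:", gps_week_of(date(2012, 7, 25)))

# Time stamp of the first return in 'spur000001.txt'.
print("adjusted:", week_to_adjusted(1698, 215248.851500))
```

The converted value of the first return lands in the same range of roughly 27.17 million seconds as the GPS time histogram bins shown further down, which is a useful plausibility check.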

The 1102 ASCII files are now 1102 LAZ files. Because we switched from GPS week time to Adjusted Standard GPS time stamps we also need to set the “global encoding” flag in the LAS header from 0 to 1 (see ASPRS LAS specification). We can do this in-place (i.e. without creating another set of files) using the following lasinfo command:

lasinfo -i spur_raw\spur*.laz ^
        -nh -nv -nc ^
        -set_global_encoding 1

To reconstruct the missing flight line information we look for gaps in the sequence of GPS time stamps by computing GPS time histograms with lasinfo and bins of 10 seconds in size:

lasinfo -i spur_raw\spur*.laz -merged ^
        -histo gps_time 10 ^
        -o spur_raw_all.txt

The resulting histogram exhibits the expected gaps in the GPS time stamps that happen when the survey plane leaves the target area and turns around to approach the next flight line. The two histogram excerpts below each contain such a gap: the start times of consecutive listed bins jump by 120 and 90 seconds respectively.

more spur_raw_all.txt
[...]
bin [27165909.595196404,27165919.595196255) has 3878890
bin [27165919.595196255,27165929.595196106) has 4314401
bin [27165929.595196106,27165939.595195957) has 435788
bin [27166049.595194317,27166059.595194168) has 1317998
bin [27166059.595194168,27166069.595194019) has 4432534
bin [27166069.595194019,27166079.59519387) has 4261732
[...]
bin [27166239.595191486,27166249.595191337) has 3289819
bin [27166249.595191337,27166259.595191188) has 3865892
bin [27166259.595191188,27166269.595191039) has 1989794
bin [27166349.595189847,27166359.595189698) has 2539936
bin [27166359.595189698,27166369.595189549) has 3948358
bin [27166369.595189549,27166379.5951894) has 3955071
[...]
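With 89 flight lines it gets tedious to look for such gaps by eye. The following Python sketch scans histogram lines of the form shown above and reports where consecutive listed bins are not adjacent. It assumes that empty bins are simply omitted from the report (as in the excerpt) and treats lines like "[...]" as elided stretches. The function name and the tolerance are our own choices.

```python
import re

BIN_PATTERN = re.compile(r"bin \[([0-9.]+),([0-9.]+)\) has (\d+)")

def find_gaps(lines, tolerance=1e-3):
    """Return (gap_start, gap_end) pairs where consecutive listed bins are not adjacent."""
    gaps = []
    previous_end = None
    for line in lines:
        match = BIN_PATTERN.match(line.strip())
        if match is None:
            # Anything that is not a bin line (e.g. "[...]") marks an elided
            # stretch, so we do not compare across it.
            previous_end = None
            continue
        start, end = float(match.group(1)), float(match.group(2))
        if previous_end is not None and start - previous_end > tolerance:
            gaps.append((previous_end, start))
        previous_end = end
    return gaps

excerpt = """
bin [27165909.595196404,27165919.595196255) has 3878890
bin [27165919.595196255,27165929.595196106) has 4314401
bin [27165929.595196106,27165939.595195957) has 435788
bin [27166049.595194317,27166059.595194168) has 1317998
bin [27166059.595194168,27166069.595194019) has 4432534
bin [27166069.595194019,27166079.59519387) has 4261732
[...]
bin [27166239.595191486,27166249.595191337) has 3289819
bin [27166249.595191337,27166259.595191188) has 3865892
bin [27166259.595191188,27166269.595191039) has 1989794
bin [27166349.595189847,27166359.595189698) has 2539936
bin [27166359.595189698,27166369.595189549) has 3948358
bin [27166369.595189549,27166379.5951894) has 3955071
"""

for gap_start, gap_end in find_gaps(excerpt.splitlines()):
    print(f"no bins listed for about {gap_end - gap_start:.0f} seconds after {gap_start:.1f}")
```

Measured from the end of the last populated bin to the start of the next one, the two empty stretches are about 110 and 80 seconds long, which corresponds to jumps of 120 and 90 seconds between the start times of the bins on either side.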

Now that we validated their existence, we use these gaps in the GPS time stamps to split the LiDAR back into the original flightlines it was collected in. Using lassplit we produce one file per flightline as follows:

lassplit -i spur_raw\spur*.laz -merged ^
         -recover_flightlines_interval 10 ^
         -odir strips_raw -o strip.laz
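Conceptually the flightline recovery boils down to starting a new strip whenever two consecutive GPS time stamps are further apart than the chosen interval. The Python sketch below illustrates that idea on a toy list of time stamps. It is not the implementation used by lassplit, and whether lassplit compares with "larger than" or "at least" the interval is an assumption on our side.

```python
def split_into_flightlines(gps_times, interval):
    """Group time-sorted GPS time stamps into runs separated by gaps larger than 'interval' seconds."""
    flightlines = []
    for t in gps_times:
        if flightlines and t - flightlines[-1][-1] <= interval:
            # Close enough to the previous return: same flightline.
            flightlines[-1].append(t)
        else:
            # First return or a gap larger than the interval: new flightline.
            flightlines.append([t])
    return flightlines

# Toy time stamps with two gaps that exceed 10 seconds.
toy_times = [0.0, 1.0, 2.0, 50.0, 51.0, 200.0]
for number, strip in enumerate(split_into_flightlines(toy_times, 10.0), start=1):
    print(f"strip {number}: {len(strip)} returns from {strip[0]} to {strip[-1]}")
```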

In the next step we repair the missing “number of returns (per pulse)” field that was not provided in the ASCII file. This can be done with lasreturn assuming that the point records in each file are sorted by increasing GPS time stamp. This happens to be true already in our case as the original ASCII files were storing the LiDAR returns in acquisition order and we have not changed this order. If the point records are not yet in this order, it can be created with lassort as follows. As these strips can have many points per file it may be necessary to run the new 64 bit executables by adding ‘-cpu64’ to the command line in order to avoid running out of memory.

lassort -i strips_raw\strip*.laz ^
        -gpstime -return_number ^
        -odir strips_sorted -olaz ^
        -cores 4 -cpu64

An order sorted by GPS time stamp is necessary as lasreturn expects point records with the same GPS time stamp (i.e. returns generated by the same laser pulse) to be back to back in the input file. To ‘-repair_number_of_returns’ the tool will load all returns with the same GPS time stamp and update the “number of returns (per pulse)” attribute of each return to the highest “return number” of the loaded set.

lasreturn -i strips_sorted\strip*.laz ^
          -repair_number_of_returns ^
          -odir strips_repaired -olaz ^
          -cores 4
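The repair logic described above can be sketched in a few lines of Python: group back-to-back records with identical GPS time stamps and give every return in the group the highest return number found. This only illustrates the idea on the three returns from ‘spur000001.txt’ shown earlier and makes no claim about how lasreturn is implemented internally.

```python
from itertools import groupby

def repair_number_of_returns(records):
    """records: list of (gps_time, return_number) tuples sorted by GPS time.

    Returns a list of (gps_time, return_number, number_of_returns) tuples.
    """
    repaired = []
    for gps_time, pulse in groupby(records, key=lambda record: record[0]):
        pulse = list(pulse)
        # The highest return number of a pulse is its number of returns.
        number_of_returns = max(return_number for _, return_number in pulse)
        for _, return_number in pulse:
            repaired.append((gps_time, return_number, number_of_returns))
    return repaired

# GPS time stamp and return number of the first three lines of 'spur000001.txt'.
sample = [(215248.851500, 1), (215248.851500, 2), (215248.851503, 1)]
for record in repair_number_of_returns(sample):
    print(record)
```

Note that groupby only merges neighbouring records, which is exactly why the input has to be sorted by GPS time stamp first.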

In a final step we use las2las with the ‘-files_are_flightlines’ option (or short ‘-faf’) to set the “file source ID” field in the LAS header and the “point source ID” attribute of every point record in the file to the same unique value per strip. The first file in the folder will have all its fields set to 1, the next file will have all its fields set to 2, the next file to 3 and so on. Please do not run this on multiple cores for the time being.

las2las -i strips_repaired\strip*.laz ^
        -files_are_flightlines ^
        -odir strips_final -olaz

It’s always useful to run a final validation of the files using lasvalidate to reassure yourself and the people you will be sharing the data with that nothing funky has happened during any of these conversion steps.

lasvalidate -i strips_final\strip*.laz ^
            -o strips_final\report.xml

And it can also be useful to add an overview in SHP or KML format to the delivery that can be created with lasboundary as follows:

lasboundary -i strips_final\strip*.laz ^
            -overview -labels ^
            -o strips_final\overview.kml

The result was 89 LAZ files (each containing one complete flightline) totaling 54 GB compared to 1102 ASCII files (each containing a slice of a flightline) totaling 574 GB.

Estonia leads in Open LiDAR: nationwide & multi-temporal Point Clouds now Online

At the beginning of July 2018 the Baltic country of Estonia – with an area of 45 thousand square kilometers inhabited by around 1.3 million people – opened much of their geospatial data archives and is now offering easy and free download of LiDAR point clouds nationwide via a portal of the Estonian Land Board. What is even more exciting is that multi-temporal data sets flown in different years and seasons are available. Raw LiDAR point clouds collected either during a “regular flight in spring” or during a “forestry flight in summer” can be obtained for multiple years. The 1 km by 1 km tile with map sheet index 377650, for example, is available for four different LiDAR surveys carried out in spring 2011, summer 2013, spring 2015 and summer 2017. This offers incredible potential for studying temporal changes of man-made or natural environments. The screenshot sequence below shows how to navigate to the download site starting from this page.

We found out about this open data release during our hands-on workshop on LiDAR and photogrammetry point cloud processing with LAStools that was part of the UAV remote sensing summer school in Tartu, Estonia. See our calendar for upcoming events or contact us for holding a similar event at your university, agency, company, or conference. 

The LiDAR data provided on the download portal is compressed with LASzip and provided as 1 km by 1 km LiDAR tiles in LAZ format. You can search for these tiles via their Estonian 1:2000 map sheet indices. To find out which map sheet index corresponds to the tile you are interested in you can overlay the map sheet indices over an online map. However, you will need to zoom in before you can see the indices as illustrated in the screenshot sequence below. Here is a zoom to the map sheet indices for the area that we visited during the social event of the summer school.

One thing we noticed is that the tiles contain only a single layer of points. The overlaps between flightlines were removed which results in a more uniform point density but strips the user of the possibility to do their own flightline alignment checks with lasoverlap. Below you see the spring 2014 acquisition for the tile with map sheet index 475861 colored by classification, elevation, return type, flightline ID and intensity.

The license for the open data of the Estonian Land Board is very permissive and can be found here. Agreeing to the licence gives the licence holder the rights to use data free of charge for an unspecified term, to good purpose in accordance with law and best practice. The licence holder may produce derivatives of data, combine data with its own products or services, use data for commercial or non-commercial purposes and redistribute data. The licence holder is obliged to refer to the origin of data when publishing and redistributing data. The reference must include the name of the licensor, the title of data, and the age of data (or the date of data extraction).

First Look with LAStools at LiDAR from Hovermap Drone by CSIRO

Last December we had a chance to visit the team of Dr. Stefan Hrabar at CSIRO in Pullenvale near Brisbane who work on a drone LiDAR system called Hovermap. This SLAM-based system is mainly developed for the purpose of autonomous flight and exploration of GPS-denied environments such as buildings, mines and tunnels. But as the SLAM algorithm continuously self-registers the scan lines it produces a LiDAR point cloud that in itself is a nice product. We started our visit with a short test flight around the on-site tower. You can download the LiDAR data and the drone trajectory of this little survey here:

The Hovermap system is based on the Velodyne Puck Lite (VLP-16) that is much cheaper and more light-weight than many other LiDAR systems. One interesting tidbit in the Hovermap setup is that the scanner is installed such that the entire Puck is constantly rotating as you can see in this video. But the Velodyne Puck is also known to produce somewhat “fluffy” surfaces with a thickness of a few centimeters. In a previous blog post with data from the YellowScan Surveyor system (that is also based on the Puck) we used a “median ground” surface to deal with the “fluff”. In the following we will have a look at the LiDAR data produced by Hovermap and how to process it with LAStools.

LiDAR data of CSIRO tower acquired during test flight of Hovermap system.

As always we start with a lasinfo report that computes the average density ‘-cd’ and histograms for the intensity and the GPS time:

lasinfo -i CSIRO_Tower\results.laz ^
        -cd ^
        -histo intensity 16 -histo gps_time 2 ^
        -odir CSIRO_Tower\quality -odix _info -otxt

A few excerpts of the resulting lasinfo report that you can download here are below:

lasinfo (180409) report for 'CSIRO_Tower\results.laz'
[...]
 number of point records: 16668904
 number of points by return: 0 0 0 0 0
 scale factor x y z: 0.0001 0.0001 0.0001
 offset x y z: -5.919576153930379 22.785394470724583 9.535698734939086
 min x y z: -138.6437 -125.2552 -34.1510
 max x y z: 126.8046 170.8260 53.2224
WARNING: full resolution of min_x not compatible with x_offset and x_scale_factor: -138.64370561381907
WARNING: full resolution of min_y not compatible with y_offset and y_scale_factor: -125.25518631070418
WARNING: full resolution of min_z not compatible with z_offset and z_scale_factor: -34.150966206894068
WARNING: full resolution of max_x not compatible with x_offset and x_scale_factor: 126.80455330595831
WARNING: full resolution of max_y not compatible with y_offset and y_scale_factor: 170.82597525215334
WARNING: full resolution of max_z not compatible with z_offset and z_scale_factor: -34.150966206894068
[...]
 gps_time 121.288045 302.983110
WARNING: 2 points outside of header bounding box
[...]
covered area in square units/kilounits: 51576/0.05
point density: all returns 323.19 last only 318.40 (per square units)
 spacing: all returns 0.06 last only 0.06 (in units)
WARNING: for return 1 real number of points by return is 16424496 but header entry was not set.
WARNING: for return 2 real number of points by return is 244408 but header entry was not set.
[...]
real max z larger than header max z by 0.000035
real min z smaller than header min z by 0.000035
[...]

Most of these warnings have to do with poorly chosen offset values in the LAS header that have many decimal digits instead of being nice round numbers. The points are stored with sub-millimeter resolution (scale factors of 0.0001) which is unnecessarily precise for a UAV flying a Velodyne Puck where the overall system error can be expected to be on the order of a few centimeters. Also the histogram of return numbers in the LAS header was not populated. We can fix these issues with one call to las2las:

las2las -i CSIRO_Tower\results.laz ^
        -rescale 0.01 0.01 0.01 ^
        -auto_reoffset ^
        -odix _fixed -olaz
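To see why the original header triggers those “not compatible” warnings, it helps to push a header value through the integer quantization of the LAS format and back. The Python sketch below does that for the reported min_x. It reflects our understanding of what the warning means rather than the exact check inside lasinfo. The rescaled example value of -138.64 and the offset of 0 are hypothetical since ‘-auto_reoffset’ picks its own round offset.

```python
def survives_round_trip(value, scale, offset, tolerance=1e-9):
    """Check whether a coordinate can be stored as an integer multiple of 'scale' from 'offset'."""
    raw = round((value - offset) / scale)      # integer stored in the file
    restored = raw * scale + offset            # value a reader would reconstruct
    return abs(restored - value) < tolerance

# Header min_x with the original scale factor and offset from the lasinfo report.
header_min_x = -138.64370561381907
print(survives_round_trip(header_min_x, 0.0001, -5.919576153930379))

# Hypothetical centimeter value with a round offset after the fix.
print(survives_round_trip(-138.64, 0.01, 0.0))
```

The first check fails because the header value carries more digits than the scale factor of 0.0001 and the odd offset can represent. The second check passes because a centimeter value on a round offset is exactly representable.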

If you create another lasinfo report on the fixed file you will see that all the warnings have gone. The file size is now also only 102 MB instead of 142 MB because centimeter coordinates compress much better than sub-millimeter coordinates.

The average density of 318 last returns per square meter reported by lasinfo is not that useful for a UAV survey because it does not account for the highly varying distribution of LiDAR returns in the area surveyed. With lasgrid we can get a much clearer picture of that.

lasgrid -i CSIRO_Tower\results_fixed.laz ^
        -last_only ^
        -step 0.5 -use_bb -density ^
        -false -set_min_max 0 1500 ^
        -o CSIRO_Tower\quality\density_0_1500.png

LiDAR density: blue is close to zero and red is 1500 or more last returns / sqr mtr
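The difference between a single average and a per-cell density is easy to illustrate in Python. The sketch below first reproduces the all-returns average from the lasinfo report (number of points divided by covered area) and then counts points per 0.5 meter cell for a handful of made-up coordinates. It anchors the grid at the origin for simplicity, so it is not a re-implementation of lasgrid with ‘-use_bb’.

```python
import math
from collections import Counter

def density_per_cell(points_xy, step):
    """Return a dict mapping (column, row) grid cells to points per square unit."""
    counts = Counter((math.floor(x / step), math.floor(y / step)) for x, y in points_xy)
    cell_area = step * step
    return {cell: count / cell_area for cell, count in counts.items()}

# Average density of all returns from the numbers in the lasinfo report.
average = 16668904 / 51576
print(f"average density: {average:.2f} returns per square meter")

# Made-up points: two fall into one 0.5 m cell and one into its neighbour.
toy_points = [(0.1, 0.1), (0.2, 0.3), (0.6, 0.1)]
for cell, density in sorted(density_per_cell(toy_points, 0.5).items()):
    print(cell, density)
```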

The red dot in the point density indicates an area with over 1500 last returns per square meter. No surprise that this is the take-off and touch-down location of the copter drone. Naturally this spot is completely over-scanned compared to the rest of the area. We can remove these points with the help of the timestamps by cutting off the start and the end of the recording.

The total recording time including take-off, flight around the tower, and touch-down was around 180 seconds or 3 minutes as the lasinfo report tells us. Note that the recorded time stamps are neither “GPS Week Time” nor “Adjusted Standard GPS Time” but an internal system time. By visualizing the trajectory of the UAV with lasview while binning the timestamps into the intensity field we can easily determine what interval of timestamps describes the actual survey flight. First we convert the drone trajectory from the textual ASCII format to the LAZ format with txt2las:

txt2las -i CSIRO_Tower\results_traj.txt ^
        -skip 1 ^
        -parse txyz ^
        -set_classification 12 ^
        -olaz

lasview -i CSIRO_Tower\results_traj.laz ^
        -bin_gps_time_into_intensity 1

Binning timestamps into intensity allows visually determining start and end of survey.

Using lasview and pressing <i> while hovering over those points of the trajectory that appear to be the survey start and end we determine visually that the timestamps between 164 and 264 correspond to the actual survey flight over the area of interest with the take-off and touch-down maneuvers excluded. We use las2las to cut out the relevant part and re-run lasgrid:

las2las -i CSIRO_Tower\results_fixed.laz ^
        -keep_gps_time 164 264 ^
        -o CSIRO_Tower\results_survey.laz

lasgrid -i CSIRO_Tower\results_survey.laz ^
        -last_only ^
        -step 0.5 -use_bb -density ^
        -false -set_min_max 0 1500 ^
        -o CSIRO_Tower\quality\density_0_1500_survey.png

LiDAR density after removing take-off and touch-down maneuvers.

The other set of points we are less interested in are those occasional hits far from the scanner that sample the area too sparsely to be useful for anything. We use lastrack to reclassify points as noise (7) that exceed an x/y distance of 50 meters from the trajectory and then use lasgrid to create another density image without the points classified as noise.

lastrack -i CSIRO_Tower\results_survey.laz ^
         -track CSIRO_Tower\results_traj.laz ^
         -classify_xy_range_between 50 1000 7 ^
         -o CSIRO_Tower\results_xy50.laz

lasgrid -i CSIRO_Tower\results_xy50.laz ^
        -last_only -keep_class 0 ^
        -step 0.5 -use_bb -density ^
        -false -set_min_max 0 1500 ^
        -o CSIRO_Tower\quality\density_0_1500_xy50.png

LiDAR density after removing returns farther than 50 m from trajectory.
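The idea behind the range-based reclassification can be sketched in Python as follows. For every return we look up the trajectory sample that is closest in time and measure the horizontal distance to it. Matching returns to the trajectory via their time stamps is our assumption about how such a tool operates, and all names and toy coordinates below are made up.

```python
import bisect
import math

def classify_by_xy_range(points, track, min_range, max_range, new_class):
    """points: (gps_time, x, y, classification) tuples.
    track: (gps_time, x, y) tuples sorted by time (must not be empty).
    Returns the points with the classification replaced where the xy range matches."""
    track_times = [sample[0] for sample in track]
    result = []
    for t, x, y, classification in points:
        i = bisect.bisect_left(track_times, t)
        # Candidate samples just before and just after the time stamp.
        neighbours = track[max(i - 1, 0):i + 1]
        _, track_x, track_y = min(neighbours, key=lambda sample: abs(sample[0] - t))
        xy_range = math.hypot(x - track_x, y - track_y)
        if min_range <= xy_range <= max_range:
            classification = new_class
        result.append((t, x, y, classification))
    return result

# Toy trajectory and two returns: one is 5 m and one is 60 m from the drone.
toy_track = [(164.0, 0.0, 0.0), (165.0, 10.0, 0.0)]
toy_points = [(164.1, 3.0, 4.0, 0), (164.9, 10.0, 60.0, 0)]
for point in classify_by_xy_range(toy_points, toy_track, 50.0, 1000.0, 7):
    print(point)
```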

We process the remaining points using a typical tile-based processing pipeline. First we run lastile to create a tiling of 200 meter by 200 meter tiles with 20 meter buffers while dropping the noise points:

lastile -i CSIRO_Tower\results_xy50.laz ^
        -drop_class 7 ^
        -tile_size 200 -buffer 20 -flag_as_withheld ^
        -odir CSIRO_Tower\tiles_raw -o eta.laz

Because of the high sampling we expect there to be quite a few duplicate points where all three coordinates x, y, and z are identical. We remove them with a call to lasduplicate:

lasduplicate -i CSIRO_Tower\tiles_raw\*.laz ^
             -unique_xyz ^
             -odir CSIRO_Tower\tiles_unique -olaz ^
             -cores 4

This removes between 12 and 25 thousand points from each tile. Then we use lasnoise to classify isolated points as noise:

lasnoise -i CSIRO_Tower\tiles_unique\*.laz ^
         -step_xy 0.5 -step_z 0.1 -isolated 5 ^
         -odir CSIRO_Tower\tiles_denoised_temp -olaz ^
         -cores 4

Aggressive parameters ensure most noise points below ground are found.
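Our reading of the isolation criterion is this: every point falls into a cell of a 3D grid (sized by ‘-step_xy’ and ‘-step_z’) and is flagged when at most ‘-isolated’ other points lie in the 3 by 3 by 3 block of cells around it. The Python sketch below implements that reading on a toy example. It is meant as an illustration under these assumptions and not as the lasnoise implementation.

```python
import math
from collections import Counter
from itertools import product

def find_isolated(points, step_xy, step_z, isolated):
    """Return one boolean per (x, y, z) point: True if the point counts as isolated."""
    def cell_of(point):
        x, y, z = point
        return (math.floor(x / step_xy), math.floor(y / step_xy), math.floor(z / step_z))

    counts = Counter(cell_of(point) for point in points)
    flags = []
    for point in points:
        cx, cy, cz = cell_of(point)
        # Sum the points in the 27 cells centered on the cell of this point.
        total = sum(counts.get((cx + dx, cy + dy, cz + dz), 0)
                    for dx, dy, dz in product((-1, 0, 1), repeat=3))
        # Subtract the point itself to get the number of *other* points.
        flags.append(total - 1 <= isolated)
    return flags

# Ten made-up points on the surface and one stray point five meters below.
surface = [(0.1 + 0.01 * i, 0.1, 0.05) for i in range(10)]
stray = (0.1, 0.1, -5.0)
flags = find_isolated(surface + [stray], step_xy=0.5, step_z=0.1, isolated=5)
print(flags)
```

With a vertical cell size of only 0.1 meters even thin structures such as wires can end up with few neighbours, which is why these settings are aggressive.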

This classifies between 13 and 23 thousand points from each tile into the noise classification code 7. We use rather aggressive settings to make sure we get most of the noise points that are below the terrain. Getting a correct ground classification in the next few steps is the main concern now even if this means that many points above the terrain on wires, towers, or vegetation will also get misclassified as noise (at least temporarily). Next we use lasthin to classify a subset of points with classification code 8 on which we will then run the ground classification. We classify each point that is closest to the 5th percentile in elevation per 25 cm by 25 cm grid cell given there are at least 20 non-noise points in a cell. We then repeat this while increasing the cell size to 50 cm by 50 cm and 100 cm by 100 cm.

lasthin -i CSIRO_Tower\tiles_denoised_temp\*.laz ^
        -ignore_class 7 ^
        -step 0.25 -percentile 5 20 -classify_as 8 ^
        -odir CSIRO_Tower\tiles_thinned_025 -olaz ^
        -cores 4

lasthin -i CSIRO_Tower\tiles_thinned_025\*.laz ^
        -ignore_class 7 ^
        -step 0.50 -percentile 5 20 -classify_as 8 ^
        -odir CSIRO_Tower\tiles_thinned_050 -olaz ^
        -cores 4

lasthin -i CSIRO_Tower\tiles_thinned_050\*.laz ^
        -ignore_class 7 ^
        -step 1.00 -percentile 5 20 -classify_as 8 ^
        -odir CSIRO_Tower\tiles_thinned_100 -olaz ^
        -cores 4
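The percentile thinning can be sketched in Python like this: bin the non-noise points into grid cells, skip cells with fewer than 20 points, and mark the point at the 5th percentile of the sorted elevations with code 8. How lasthin defines the percentile rank and breaks ties is not documented here, so the simple rank formula below is our assumption.

```python
import math
from collections import defaultdict

def classify_percentile_points(points, step, percentile, min_points, new_class, ignore_class=7):
    """points: list of [x, y, z, classification] lists that are modified in place.

    'percentile' is an integer between 0 and 100."""
    cells = defaultdict(list)
    for index, (x, y, z, classification) in enumerate(points):
        if classification != ignore_class:
            cells[(math.floor(x / step), math.floor(y / step))].append(index)
    for indices in cells.values():
        if len(indices) < min_points:
            continue  # too few points for a meaningful percentile
        indices.sort(key=lambda i: points[i][2])
        # Assumed rank definition: position in the sorted elevations.
        rank = min(len(indices) - 1, len(indices) * percentile // 100)
        points[indices[rank]][3] = new_class
    return points

# Made-up cell with 20 points at elevations 0..19, one noise point far below,
# and a lone point in another cell.
toy = [[0.1, 0.1, float(z), 0] for z in range(20)]
toy.append([0.2, 0.2, -9.0, 7])
toy.append([0.9, 0.9, 1.0, 0])
classify_percentile_points(toy, step=0.25, percentile=5, min_points=20, new_class=8)
print([p for p in toy if p[3] == 8])
```

Choosing a low percentile instead of the minimum makes the selection robust against the few low noise points that survive the lasnoise step.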

Then we ground classify the points that were classified into the temporary classification code 8 in the previous step using lasground.

lasground -i CSIRO_Tower\tiles_thinned_100\*.laz ^
          -ignore_class 7 0 ^
          -town -ultra_fine ^
          -odir CSIRO_Tower\tiles_ground -olaz ^
          -cores 4

The resulting ground points are a lower envelope of the “fluffy” sampled surfaces produced by the Velodyne Puck scanner. We use lasheight to thicken the ground by moving all points between 1 cm below and 6 cm above the TIN of these “low ground” points to a temporary classification code 6 representing a “thick ground”. We also undo the overly aggressive noise classifications above the ground by setting all higher points back to classification code 1 (unclassified).

lasheight -i CSIRO_Tower\tiles_ground\*.laz ^
          -classify_between -0.01 0.06 6 ^
          -classify_above 0.06 1 ^
          -odir CSIRO_Tower\tiles_ground_thick -olaz ^
          -cores 4
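Expressed as a rule per point, the two lasheight options amount to the following Python sketch. It takes the height above the ground TIN as given (computing the TIN is left out) and the handling of points exactly on the thresholds is our assumption.

```python
def reclassify_by_height(classification, height_above_ground):
    """Return the new classification for a point given its height above the ground TIN in meters."""
    if -0.01 <= height_above_ground <= 0.06:
        return 6   # temporary "thick ground"
    if height_above_ground > 0.06:
        return 1   # back to unclassified, undoing aggressive noise flags above ground
    return classification  # points further below the ground keep their code

# Made-up examples: (classification, height above ground in meters).
examples = [(2, 0.0), (8, 0.03), (7, 0.5), (7, -0.3)]
for classification, height in examples:
    print(classification, height, "->", reclassify_by_height(classification, height))
```

Noise points below the terrain keep their classification code 7, so only the noise flags above the ground are undone.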

Profile view for 25 centimeter wide strip of open terrain. Top: Green points are low ground. Orange points are thickened ground with 5 cm drop lines. Middle: Brown points are median ground computed from thick ground. Bottom: Comparing low ground points (in green) with median ground points (in brown).

From the “thick ground” we then compute a “median ground” using lasthin in a similar fashion as we used it before. A profile view for a 25 centimeter wide strip of open terrain illustrates the workflow of going from “low ground” via “thick ground” to “median ground” and shows the slight difference in elevation between the two.

lasthin -i CSIRO_Tower\tiles_ground_thick\*.laz ^
        -ignore_class 0 1 7 ^
        -step 0.25 -percentile 50 10 -classify_as 2 ^
        -odir CSIRO_Tower\tiles_ground_median_025 -olaz ^
        -cores 4

lasthin -i CSIRO_Tower\tiles_ground_median_025\*.laz ^
        -ignore_class 0 1 7 ^
        -step 0.50 -percentile 50 10 -classify_as 2 ^
        -odir CSIRO_Tower\tiles_ground_median_050 -olaz ^
        -cores 4

lasthin -i CSIRO_Tower\tiles_ground_median_050\*.laz ^
        -ignore_class 0 1 7 ^
        -step 1.00 -percentile 50 10 -classify_as 2 ^
        -odir CSIRO_Tower\tiles_ground_median_100 -olaz ^
        -cores 4

Then we use lasnoise once more with more conservative settings to remove the noise points that are sprinkled around the scene.

lasnoise -i CSIRO_Tower\tiles_ground_median_100\*.laz ^
         -step_xy 1.0 -step_z 1.0 -isolated 5 ^
         -odir CSIRO_Tower\tiles_denoised -olaz ^
         -cores 4

While we classify the scene into building roofs, vegetation, and everything else with lasclassify we also move all (unused) classifications to classification code 1 (unclassified). You may play with the parameters of lasclassify (see README) to achieve a better building classification. However, those buildings the laser can peek into (either via a window or because they are gazebo-like structures) will not be classified correctly unless you remove the points that are under the roof somehow.

lasclassify -i CSIRO_Tower\tiles_denoised\*.laz ^
            -ignore_class 7 ^
            -change_classification_from_to 0 1 ^
            -change_classification_from_to 6 1 ^
            -step 1 ^
            -odir CSIRO_Tower\tiles_classified -olaz ^
            -cores 4

A glimpse at the final classification result is below: a hillshaded DTM and a strip of classified points. Of course the tower was misclassified as vegetation given that it looks just like a tree to the logic used in lasclassify.

The hillshaded DTM with a strip of classified points.

Finally we remove the tile buffers (that were really important for tile-based processing) with lastile:

lastile -i CSIRO_Tower\tiles_classified\*.laz ^
        -remove_buffer ^
        -odir CSIRO_Tower\tiles_final -olaz ^
        -cores 4

And publish the LiDAR point cloud with version 1.6 of Potree using laspublish:

laspublish -i CSIRO_Tower\tiles_final\*.laz ^
           -i CSIRO_Tower\results_traj.laz ^
           -only_3D -elevation -overwrite -potree16 ^
           -title "CSIRO Tower" ^
           -description "HoverMap test flight, 18 Dec 2017" ^
           -odir CSIRO_Tower\tiles_portal -o portal.html -olaz

Note that we also added the trajectory of the drone because it looks nice and illustrates how the UAV was scanning the scene.

Via Potree we can publish and explore the final point cloud using any modern Web browser.

We would like to thank the entire team around Dr. Stefan Hrabar for taking time out of their busy schedules just a few days before Christmas.

Scotland’s LiDAR goes Open Data (too)

Following the lead of England and Wales, the Scottish LiDAR is now also open data. The implementation of such an open geospatial policy in the United Kingdom was spearheaded by the Environment Agency of England who started to make all of their LiDAR holdings available as open data. In September 2015 they opened DTM and DSM raster derivatives down to 25 cm resolution and in March 2016 the raw point clouds also went online in our compressed and open LAZ format (more info here) – all with the very permissive Open Government Licence v3. This treasure trove of geospatial data was collected by the Environment Agency Geomatics’ own survey aircraft mainly for flood mapping purposes. The data had been access restricted for the past 17 years of operation and was made open only after it was shown that restricting access in order to recover costs to finance future operations – a common argument for withholding tax-payer funded data – was nothing but an utter myth. This open data policy has resulted in an incredible re-use of the LiDAR and the Environment Agency has literally been propelled into the role of a “champion for open data” inspiring Wales (possibly the German states of North Rhine-Westphalia and Thuringia) and now also Scotland to open up their geospatial archives as well …

Huge LAS files available for download from the Scottish Open Data portal.

We went to the nice online portal of Scotland to download three files from the Phase II LiDAR for Scotland that are provided as uncompressed LAS files, namely LAS_NN45NE.las, LAS_NN55NE.las, and LAS_NN55NW.las, whose sizes are listed as 1.2 GB, 2.8 GB, and 4.7 GB in the screenshot above. Needless to say, it took quite some time and several restarts (using wget with option ‘-c’) to successfully download these very large LAS files.

laszip -i LAS_NN45NE.las -odix _cm -olaz -rescale 0.01 0.01 0.01 
laszip -i LAS_NN45NE.las -odix _mm -olaz
laszip -i LAS_NN55NE.las -odix _cm -olaz -rescale 0.01 0.01 0.01 
laszip -i LAS_NN55NE.las -odix _mm -olaz
laszip -i LAS_NN55NW.las -odix _cm -olaz -rescale 0.01 0.01 0.01 
laszip -i LAS_NN55NW.las -odix _mm -olaz

After downloading we decided to see how well these files compress with LASzip by running the six commands shown above. They create LAZ files with the coordinate resolution re-scaled to centimeters (cm) and LAZ files with the original millimeter (mm) coordinate resolution (the original scale factors are 0.001, which is somewhat excessive for aerial LiDAR where the error in position per coordinate is typically between 5 cm and 20 cm). Below you see the resulting file sizes for the three different files.

 1,164,141,247 LAS_NN45NE.las
   124,351,690 LAS_NN45NE_cm.laz (1 : 9.4)
   146,651,719 LAS_NN45NE_mm.laz (1 : 7.9)
 2,833,123,863 LAS_NN55NE.las
   396,521,115 LAS_NN55NE_cm.laz (1 : 7.1)
   474,767,495 LAS_NN55NE_mm.laz (1 : 6.0)
 4,664,782,671 LAS_NN55NW.las
   531,454,473 LAS_NN55NW_cm.laz (1 : 8.8)
   629,141,151 LAS_NN55NW_mm.laz (1 : 7.4)
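
The ratios in parentheses are simply the LAS size divided by the LAZ size. A few lines of Python reproduce them from the byte counts listed above (the variable names are ours):

```python
# file sizes in bytes: (uncompressed LAS, LAZ at cm resolution, LAZ at mm resolution)
sizes = {
    "LAS_NN45NE": (1_164_141_247, 124_351_690, 146_651_719),
    "LAS_NN55NE": (2_833_123_863, 396_521_115, 474_767_495),
    "LAS_NN55NW": (4_664_782_671, 531_454_473, 629_141_151),
}

# compression ratio = uncompressed size / compressed size, rounded to one decimal
ratios = {
    name: (round(las / cm, 1), round(las / mm, 1))
    for name, (las, cm, mm) in sizes.items()
}

for name, (cm_ratio, mm_ratio) in ratios.items():
    print(f"{name}: 1 : {cm_ratio} (cm)   1 : {mm_ratio} (mm)")
```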

The savings in download time and storage space of storing the LiDAR in LAZ versus LAS are sixfold to tenfold. If I were a tax payer in Scotland and if my government were hosting open data in the Amazon cloud (i.e. paying for AWS cloud services with my taxes) I would encourage them to store their data in a more compressed format. Here are some more details on the data.

According to the provided meta data, the Scottish Public Sector LiDAR Phase II dataset was commissioned by the Scottish Government in response to the Flood Risk Management Act (2009). The project was managed by Sniffer and the contract was awarded to Fugro BKS. Airborne LiDAR data was collected for 66 sites (the dataset does not have full national coverage) totaling 3,516 km^2 between 29th November 2012 and 18th April 2014. The point density was a minimum of 1 point/sqm, and approximately 2 points/sqm on average. A DTM and DSM were produced from the point clouds, with 1m spatial resolution. The Coordinate reference system is OSGB 1936 / British National Grid (EPSG code 27700). The data is licensed under an Open Government Licence. However, under the use constraints section it not only states that the following attribution statement must be used to acknowledge the source of the information: “Copyright Scottish Government and SEPA (2014)” but also that Fugro retain the commercial copyright, which is somewhat disconcerting and may require more clarification. According to this tweet a lesser license (NCGL) applies to the raw LiDAR point clouds. Below is a lasinfo report for the large LAS_NN55NW.las as well as several visualizations with lasview.

lasinfo (170915) report for LAS_NN55NW.las
reporting all LAS header entries:
 file signature: 'LASF'
 file source ID: 0
 global_encoding: 1
 project ID GUID data 1-4: 00000000-0000-0000-0000-000000000000
 version major.minor: 1.2
 system identifier: 'Riegl LMS-Q'
 generating software: 'Fugro LAS Processor'
 file creation day/year: 343/2016
 header size: 227
 offset to point data: 227
 number var. length records: 0
 point data format: 1
 point data record length: 28
 number of point records: 166599373
 number of points by return: 149685204 14102522 2531075 280572 0
 scale factor x y z: 0.001 0.001 0.001
 offset x y z: 250050 755050 270
 min x y z: 250000.000 755000.000 203.731
 max x y z: 254999.999 759999.999 491.901
reporting minimum and maximum for all LAS point record entries ...
 X -50000 4949999
 Y -50000 4949999
 Z -66269 221901
 intensity 39 2046
 return_number 1 4
 number_of_returns 1 4
 edge_of_flight_line 0 1
 scan_direction_flag 1 1
 classification 1 11
 scan_angle_rank -30 30
 user_data 0 3
 point_source_ID 66 91
 gps_time 38230669.389034 38402435.753789
number of first returns: 149685204
number of intermediate returns: 2813604
number of last returns: 149687616
number of single returns: 135599244
overview over number of returns of given pulse: 135599244 23122229 6754118 1123782 0 0 0
histogram of classification of points:
 287819 unclassified (1)
 109019874 ground (2)
 14476880 low vegetation (3)
 3487218 medium vegetation (4)
 39141518 high vegetation (5)
 165340 building (6)
 13508 rail (10)
 7216 road surface (11)
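
The numbers in this report can be cross-checked against each other with a few lines of Python. The sketch below only uses values printed above. The helper name `to_integer` is ours and it assumes the usual LAS convention that a coordinate is stored as the integer (coordinate - offset) / scale.

```python
# values copied from the lasinfo report above
offset_to_point_data = 227
point_record_length = 28
number_of_point_records = 166_599_373
points_by_return = [149_685_204, 14_102_522, 2_531_075, 280_572, 0]
classification_histogram = {
    1: 287_819, 2: 109_019_874, 3: 14_476_880, 4: 3_487_218,
    5: 39_141_518, 6: 165_340, 10: 13_508, 11: 7_216,
}

# header plus fixed-size point records should add up to the file size on disk
expected_file_size = offset_to_point_data + point_record_length * number_of_point_records
print(expected_file_size)  # compare with the size of LAS_NN55NW.las listed earlier

def to_integer(coordinate, offset, scale):
    """Quantize a coordinate to the integer stored in the LAS point record."""
    return round((coordinate - offset) / scale)

# min/max x and z from the header, converted with scale 0.001 and the header offsets
x_range = (to_integer(250000.000, 250050, 0.001), to_integer(254999.999, 250050, 0.001))
z_range = (to_integer(203.731, 270, 0.001), to_integer(491.901, 270, 0.001))
print(x_range, z_range)
```

The expected file size matches the 4,664,782,671 bytes of the downloaded file, the return counts and the classification histogram both sum to the number of point records, and the quantized ranges match the raw X and Z entries of the report.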

Kudos to the Scottish government for opening their data. We hereby acknowledge the source of the LiDAR that we have used in the experiments above as “Copyright Scottish Government and SEPA (2014)”.

LAStools Win Big at INTERGEO Taking Home Two Innovation Awards

PRESS RELEASE (for immediate release)
October 2, 2017
rapidlasso GmbH, Gilching, Germany

At INTERGEO 2017 in Berlin, rapidlasso GmbH – the makers of the popular LiDAR processing software LAStools – were awarded top honors in both of the categories they had been nominated for: most innovative software and most innovative startup. The third award for most innovative hardware went to Leica Geosystems for the BLK360 terrestrial scanner. The annual Wichmann Innovation Awards have been part of INTERGEO for six years now. Already at the inaugural event in 2012 the open source LiDAR compressor LASzip of rapidlasso GmbH had been nominated, coming in as runner-up in second place.

Dr. Martin Isenburg, the founder and CEO of rapidlasso GmbH, receives the two innovation awards at the ceremony during INTERGEO 2017 in Berlin.

After receiving the two awards Dr. Martin Isenburg, the founder and CEO of rapidlasso GmbH, was quick to thank the “fun, active, and dedicated user community” of the LAStools software for their “incredible support in the online voting”. He pointed out that it is its users who make LAStools more than just efficient software for processing point clouds. Since 2011, the community surrounding LAStools has constantly grown to several thousand users who help and motivate each other in designing workflows and in solving format issues and processing challenges. They are an integral part of what makes these tools so valuable, according to Dr. Isenburg.

About rapidlasso GmbH:
Technology powerhouse rapidlasso GmbH specializes in efficient LiDAR processing tools that are widely known for their high productivity. They combine robust algorithms with efficient I/O and clever memory management to achieve high throughput for data sets containing billions of points. The company’s flagship product – the LAStools software suite – has deep market penetration and is heavily used in industry, government agencies, research labs, and educational institutions. Visit http://rapidlasso.com for more information.

Leaked: “Classified LiDAR” of Pentagon in LAS 1.4 Format

LiDAR leaks have happened! Black helicopters are in the sky! A few days ago a tiny tweet leaked the online location of “classified LiDAR” for Washington, DC. This LiDAR really is “classified” and includes an aerial scan of the Pentagon. For rogue scientists world-wide we offer a secret download link. It links to a file code-named ‘pentagon.laz’ that contains the 8,044,789 “classified” returns of the Pentagon shown below. This “classified file” can be deciphered by any software with native LAZ support. It was encrypted with the “LAS 1.4 compatibility mode” of LASzip. The original LAS 1.4 content was encoded into an inconspicuous-looking LAZ file. New point attributes (such as the scanner channel) were hidden as “extra bytes” for fully lossless encryption. Use ‘laszip’ to fully decode the original “classified” LAS 1.4 file … (-;

Seriously, a tiled LiDAR data set for the District of Columbia flown in 2015 is available for anyone to use on Amazon S3 with a very permissive open data license, namely the Creative Commons Attribution 3.0 License. The LiDAR coverage can be explored via this interactive map. The tiles are provided in LAS 1.4 format and use the new point type 6. We downloaded a few tiles near the White House, the Capitol, and the Pentagon to test the “native LAS 1.4 extension” of our LASzip compressor which will be released soon (a prototype for testing is already available). As these uncompressed LAS files are YUUUGE we use the command line utility ‘wget’ for downloading. With option ‘-c’ the download continues where it left off in case the transfer gets interrupted.

LiDAR pulse density from 20 or less (blue) to 100 or more (red) pulses per square meter.

We use lasboundary to create labeled bounding boxes for display in Google Earth and lasgrid to create a false color visualization of pulse density with the command lines shown below. Pulse densities of 20 or below are mapped to blue. Pulse densities of 100 or above are mapped to red. We picked the min value 20 and the max value 100 for this false color mapping by running lasinfo with the ‘-cd’ option to compute an average pulse density and then refining the numbers experimentally. We also use lasoverlap to visualize how flightlines overlap and how well they align. Vertical differences of up to 20 cm are mapped to white and differences of 40 cm or more are mapped to saturated blue or red.

lasboundary -i *.las ^
            -use_bb ^
            -labels ^
            -odir quality -odix _bb -okml

lasgrid -i *.las ^
        -keep_last ^
        -point_density -step 2 ^
        -false -set_min_max 20 100 ^
        -odir quality -odix _d_20_100 -opng ^
        -cores 2

lasoverlap -i *.las ^
           -min_diff 0.2 -max_diff 0.4 ^
           -odir quality -opng ^
           -cores 2
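
The clamping behind '-set_min_max 20 100' can be sketched in a few lines of Python. Note that this is only an illustration: the function name `false_color` is hypothetical and the simple linear blue-to-red blend is our assumption, not the actual color ramp that lasgrid uses between the two end points.

```python
def false_color(value, lo=20.0, hi=100.0):
    """Map a pulse density to an (r, g, b) tuple.

    Values at or below 'lo' become pure blue, values at or above 'hi'
    become pure red. The linear blend in between is an assumed ramp.
    """
    t = (value - lo) / (hi - lo)
    t = min(max(t, 0.0), 1.0)  # clamp to the [lo, hi] range
    return (round(255 * t), 0, round(255 * (1.0 - t)))

for density in (5, 20, 60, 100, 250):
    print(density, false_color(density))
```

Everything outside the chosen range saturates, which is why picking the min and max values well matters for a readable picture.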

The visualization of the pulse density and of the flightline overlap both show that there is no LiDAR for the White House or Capitol Hill. We will never know how tall the tomato and kale plants had grown in Michelle Obama’s organic garden on that day. Note that the White House and Capitol Hill were not simply “cut out”. Instead the flight plan of the survey plane was carefully designed to avoid these areas. Surprisingly, the Pentagon did not receive the same treatment and is (almost) fully included in the open LiDAR as mentioned in the dramatic first paragraph. It is interesting how the varying (tidal?) water level of the Potomac River shows up in the visualization of flightline misalignments.

There are a number of issues in these LiDAR files. The most serious ones are reported at the very end of this article. We will now scrutinize the partly-filled tile 2016.las close to the White House with only 11,060,334 returns. A lasvalidate check immediately reports three deviations from the LAS 1.4 specification:

lasvalidate -i 2016.las -o 2016_check.xml
  1. For proper LAS 1.4 files containing point type 6 through 10 all ‘legacy’ point counts in the LAS header should be set to 0. The following six fields in the LAS header should be zero for tile 2016.las (and all other tiles):
    + legacy number of point records
    + legacy number of points by return[0]
    + legacy number of points by return[1]
    + legacy number of points by return[2]
    + legacy number of points by return[3]
    + legacy number of points by return[4]
  2. There should not be any LiDAR return in a valid LAS file whose ‘number of returns of given pulse’ attribute is zero but there are 8 such points in tile 2016.las (and many more in various other tiles).
  3. There should not be any LiDAR return whose ‘return number’ attribute is larger than their ‘number of returns of given pulse’ attribute but there are 8 such points in tile 2016.las (and many more in various other tiles).

The first issue is trivial. There is an efficient in-place fix with lasinfo that does not require rewriting the entire file. It uses the following command line:

lasinfo -i 2016.las ^
        -nh -nv -nc ^
        -set_number_of_point_records 0 ^
        -set_number_of_points_by_return 0 0 0 0 0

A quick check with las2txt shows us that the second and third issue are caused by the same eight points. Instead of writing an 8 for the ‘number of returns’ attribute the LAS file exporter must have written a 0 (the fifth column for all eight returns) and instead of writing an 8 for the ‘return number’ attribute the LAS file exporter must have written a 1 (the fourth column of the fifth row). We can tell this point apart from the true first return via its z coordinate of 12.50 (the third column) as the last return should be the lowest of all.

las2txt -i 2016.las ^
        -keep_number_of_returns 0 ^
        -parse xyzrnt ^
        -stdout
397372.70 136671.62 33.02 4 0 112813299.954811
397372.03 136671.64 28.50 5 0 112813299.954811
397371.28 136671.67 23.48 6 0 112813299.954811
397370.30 136671.68 16.86 7 0 112813299.954811
397369.65 136671.70 12.50 1 0 112813299.954811
397374.37 136671.58 44.17 3 0 112813299.954811
397375.46 136671.56 51.49 1 0 112813299.954811
397374.86 136671.57 47.45 2 0 112813299.954811
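
The two validation rules and a possible repair for these eight points can be sketched in Python as follows. This is an illustration on the listing above and not what lasvalidate or las2las do internally. The repair assumes that the returns of a single pulse can be ordered by descending elevation, which is plausible for this downward-looking airborne scan but is not a general rule.

```python
# las2txt output from above, parsed as 'xyzrnt'
listing = """\
397372.70 136671.62 33.02 4 0 112813299.954811
397372.03 136671.64 28.50 5 0 112813299.954811
397371.28 136671.67 23.48 6 0 112813299.954811
397370.30 136671.68 16.86 7 0 112813299.954811
397369.65 136671.70 12.50 1 0 112813299.954811
397374.37 136671.58 44.17 3 0 112813299.954811
397375.46 136671.56 51.49 1 0 112813299.954811
397374.86 136671.57 47.45 2 0 112813299.954811
"""

points = []
for line in listing.splitlines():
    x, y, z, r, n, t = line.split()
    points.append({"z": float(z), "r": int(r), "n": int(n), "t": t})

# rule 2: 'number of returns' must not be zero
zero_returns = [p for p in points if p["n"] == 0]
# rule 3: 'return number' must not exceed 'number of returns'
r_exceeds_n = [p for p in points if p["r"] > p["n"]]

# assumed repair: highest point is return 1, lowest point is the last return
ordered = sorted(points, key=lambda p: p["z"], reverse=True)
for number, p in enumerate(ordered, start=1):
    p["r_fixed"] = number
    p["n_fixed"] = len(ordered)

changed = [(p["z"], p["r"], p["r_fixed"]) for p in ordered if p["r"] != p["r_fixed"]]
print(len(zero_returns), len(r_exceeds_n), changed)
```

Only the lowest point at z = 12.50 gets a new return number, going from 1 to 8, which agrees with the explanation above.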

With las2las we can change the ‘number of returns’ from 0 to 8 using a ‘-filtered_transform’ as illustrated in the command line below. We suspect that higher numbers of returns such as 9 or 10 might have been mapped to 1 and 2. Fixing those as well as repairing the wrong return numbers will require a more complex tool. We recommend checking all tiles with more scrutiny using the lasreturn tool. But wait … more return numbering issues are to come.

las2las -i 2016.las ^
        -keep_number_of_returns 0 ^
        -filtered_transform ^
        -set_extended_number_of_returns 8 ^
        -odix _fixed -olas

A closer look at the scan pattern reveals that the LiDAR survey was flown with a dual-beam system where two laser beams scan the terrain simultaneously. This is evident in the textual representation below as there are multiple “sets of returns” for the same GPS time stamp such as 112813952.110394. We group the returns from the two beams into an orange and a green group. Their coordinates show that the two laser beams point in different directions when they are simultaneously “shot” and therefore hit the terrain far apart from one another.

las2txt -i 2016.las ^
        -keep_gps_time 112813952.110392 112813952.110396 ^
        -parse xyzlurntp ^
        -stdout
397271.40 136832.35 54.31 0 0 1 1 112813952.110394 117
397277.36 136793.35 38.68 0 1 1 4 112813952.110394 117
397277.35 136793.56 32.89 0 1 2 4 112813952.110394 117
397277.34 136793.88 24.13 0 1 3 4 112813952.110394 117
397277.32 136794.25 13.66 0 1 4 4 112813952.110394 117

The information about which point is from which beam is currently stored in the generic ‘user data’ attribute instead of in the dedicated ‘scanner channel’ attribute. This can be fixed with las2las as follows.

las2las -i 2016.las ^
        -copy_user_data_into_scanner_channel ^
        -set_user_data 0 ^
        -odix _fixed -olas

Unfortunately the LiDAR files have much more serious issues in the return numbering. It’s literally a “Total Disaster!” and “Sad!” as the US president will tweet shortly. After grouping all returns with the same GPS time stamp into an orange and a green group there is one more set of returns left unaccounted for.

las2txt -i 2016.las ^
        -keep_gps_time 112813951.416451 112813951.416455 ^
        -parse xyzlurntpi ^
        -stdout
397286.02 136790.60 45.90 0 0 1 4 112813951.416453 117 24
397286.06 136791.05 39.54 0 0 2 4 112813951.416453 117 35
397286.10 136791.51 33.34 0 0 3 4 112813951.416453 117 24
397286.18 136792.41 21.11 0 0 4 4 112813951.416453 117 0
397286.12 136791.75 30.07 0 0 1 1 112813951.416453 117 47
397291.74 136750.70 45.86 0 1 1 1 112813951.416453 117 105
las2txt -i 2016.las ^
        -keep_gps_time 112813951.408708 112813951.408712 ^
        -parse xyzlurntpi ^
        -stdout
397286.01 136790.06 45.84 0 0 1 4 112813951.408710 117 7
397286.05 136790.51 39.56 0 0 2 4 112813951.408710 117 15
397286.08 136790.96 33.33 0 0 3 4 112813951.408710 117 19
397286.18 136792.16 17.05 0 0 4 4 112813951.408710 117 0
397286.11 136791.20 30.03 0 0 1 2 112813951.408710 117 58
397286.14 136791.67 23.81 0 0 2 2 112813951.408710 117 42
397291.73 136750.16 45.88 0 1 1 1 112813951.408710 117 142

This can be visualized with lasview and the result is unmistakably clear: The return numbering is messed up. There should be one shot with five returns (not a group of four and a single return) in the first example. And there should be one shot with six returns (not a group of four and a group of two returns) in the second example. Such a broken return numbering results in extra first (or last) returns. These are serious issues that affect any algorithm that relies on the return numbering such as first-return DSM generation or canopy cover computation. Those extra returns will also make the pulse density appear higher and the pulse spacing appear tighter than they really are. The numbers from 20 (blue) to 100 (red) pulses per square meter in our earlier visualization are definitely inflated.

lasview -i 2016.las ^
        -keep_gps_time 112813951.416451 112813951.416455 ^
        -color_by_return

lasview -i 2016.las ^
        -keep_gps_time 112813951.408708 112813951.408712 ^
        -color_by_return
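
The same finding can be reproduced from the two las2txt listings above with a short Python sketch. It groups the returns by GPS time stamp and beam and counts the first returns per group. It relies on two things stated earlier: the beam is stored in the ‘user data’ attribute and each beam fires one shot per time stamp. The variable names are ours.

```python
from collections import defaultdict

# las2txt output for both GPS time stamps above, parsed as 'xyzlurntpi'
listing = """\
397286.02 136790.60 45.90 0 0 1 4 112813951.416453 117 24
397286.06 136791.05 39.54 0 0 2 4 112813951.416453 117 35
397286.10 136791.51 33.34 0 0 3 4 112813951.416453 117 24
397286.18 136792.41 21.11 0 0 4 4 112813951.416453 117 0
397286.12 136791.75 30.07 0 0 1 1 112813951.416453 117 47
397291.74 136750.70 45.86 0 1 1 1 112813951.416453 117 105
397286.01 136790.06 45.84 0 0 1 4 112813951.408710 117 7
397286.05 136790.51 39.56 0 0 2 4 112813951.408710 117 15
397286.08 136790.96 33.33 0 0 3 4 112813951.408710 117 19
397286.18 136792.16 17.05 0 0 4 4 112813951.408710 117 0
397286.11 136791.20 30.03 0 0 1 2 112813951.408710 117 58
397286.14 136791.67 23.81 0 0 2 2 112813951.408710 117 42
397291.73 136750.16 45.88 0 1 1 1 112813951.408710 117 142
"""

# one shot per (GPS time stamp, beam), with the beam taken from 'user data'
shots = defaultdict(list)
for line in listing.splitlines():
    x, y, z, channel, beam, r, n, gps_time, source_id, intensity = line.split()
    shots[(gps_time, beam)].append((int(r), int(n)))

# per shot: (number of returns found, number of returns labeled as first return)
report = {
    shot: (len(returns), sum(1 for r, _ in returns if r == 1))
    for shot, returns in shots.items()
}

naive_pulses = sum(firsts for _, firsts in report.values())  # counting first returns
actual_pulses = len(shots)                                   # counting shots
for shot, (total, firsts) in report.items():
    print(shot, "returns:", total, "first returns:", firsts)
print("pulses by first returns:", naive_pulses, "actual shots:", actual_pulses)
```

Counting first returns suggests six pulses where only four shots were fired, which is exactly the kind of inflation of the pulse density described above.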

After all these troubles, here is something nice: side-by-side a first-return TIN and a spike-free TIN (using a freeze of 0.8 m) of the center court cafe in the Pentagon. Especially given all these “fake first returns” in the Washington DC LiDAR we really need the spike-free algorithm to finally “Make a DSM great again!” … (-;

We would like to acknowledge the District of Columbia Office of the Chief Technology Officer (OCTO) for providing this data with a very permissive open data license, namely the Creative Commons Attribution 3.0 License.