FOSS4G 2015 – Seoul, South Korea – September 14th-19th

DEVELOPMENT OF DATA ARCHIVING AND
DISTRIBUTION SYSTEM FOR THE PHILIPPINES' LIDAR
PROGRAM USING OBJECT STORAGE SYSTEMS
Ken Abryl Eleazar Salanio
Data Archiving and Distribution Component
PHL-LiDAR 1
kasalanio@dream.upd.edu.ph
Outline
• Introduction
• Related Work
• Working Design
• Ceph Object Storage System
• Archiving Process Flow
• Summary
Introduction
Introduction – The Philippine Hazard Setting
2014 Typhoon Tracks
Image sources:
https://en.wikipedia.org/wiki/Timeline_of_the_2014_Pacific_typhoon_season
https://en.wikipedia.org/wiki/Ring_of_Fire
• The Philippines lies along the Pacific Typhoon Belt and the Ring of Fire
• It is prone to earthquakes, typhoons, and other hazards
• It is abundant in natural resources
• Mapping is needed to assess disaster risk and to account for natural resources
• The Philippines’ Department of Science and Technology (DOST), with Higher Education Institutions (HEIs), organized programs for mapping: PHL-LiDAR 1 and PHL-LiDAR 2
• These programs are an extension of the Disaster Risk and Exposure Assessment for Mitigation (DREAM) LiDAR program
PHL-LiDAR 1
• Data Acquisition
• Data Validation
• Data Processing
• Training & IEC
• Data Archiving
• Flood Modeling
PHL-LiDAR 2
• Agriculture
• Forest
• Coastal
• Energy
• Hydrology
• LiDAR mapping produces high-resolution geospatial data
• High-resolution data acquisition leads to very large data volumes
• Storage, indexing, retrieval, and distribution prove challenging
Related Work
Related Work - File Based Storage
• File-based storage systems are commonly used to store data
• Little setup needed
• Pervasive technology
• Complexity of directory structure increases with amount of data and processes
Related Work - GIS-enabled RDBMS
• GIS-enabled Relational Database Management Systems or Geodatabases
• Two types of design:
o Spatial indexing is on a separate layer
o Specialized spatial columns
• Indexing overhead, especially on updates
• Scalability and query time issues
• Limited support for point cloud data
Related Work - Combined Approach
• Combines LiDAR flat tiles, RDBMS, and distributed infrastructure
o RDBMS manages metadata
o LiDAR tiles are stored in dedicated and distributed storage
o Data processing is carried out by high-performance compute servers
• e.g. OpenTopography’s architecture
• Comprises various software and hardware resources
• Actual data sets are stored in ASPRS LAS format on a dedicated storage server
• Metadata is stored in an IBM DB2 database
• Other datasets are stored on the SDSC Cloud platform
• Processing and visualization requests are handled by a dedicated large-memory, multiprocessor system
Image source: http://www.opentopography.org/index.php/about/systemarch2
• While highly appealing, this approach raises concerns:
o Cost and difficulty of infrastructure upgrade
o Internet connection speed and reliability
Working Design
Working Design - LiDAR Portal for Archiving and Distribution (LiPAD)
Working Design - LiPAD
• LiPAD, the customized GeoNode platform, handles the web user interface
• GeoServer stores small shapefiles, raster, and vector data
• Large files are tiled, named and indexed by the coordinates of each tile, and stored in Ceph
• Metadata of tiled files is indexed in LiPAD, represented by a tiled shapefile of the Philippines
Working Design
• Added features:
o Authentication using Active Directory
o Metadata Indexing for Ceph Objects, represented as a grid
o Tiled selection of data
o Data cart
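The tiled selection and data cart features can be pictured with a small sketch. This is an illustration only, not LiPAD's code: it assumes 1 km tiles identified by their lower-left corner in projected metres, and the index contents, object keys, and the `tiles_in_bbox` helper are all hypothetical.

```python
TILE_SIZE_M = 1000  # assumed tile edge length in metres (1 km tiles)

def tiles_in_bbox(min_e, min_n, max_e, max_n, tile_size=TILE_SIZE_M):
    """Return lower-left (easting, northing) corners of grid tiles touching the box."""
    first_e = int(min_e // tile_size) * tile_size
    first_n = int(min_n // tile_size) * tile_size
    corners = []
    e = first_e
    while e < max_e:
        n = first_n
        while n < max_n:
            corners.append((e, n))
            n += tile_size
        e += tile_size
    return corners

# Hypothetical metadata index: tile corner -> object key in the object store.
index = {
    (500000, 1600000): "E500N1600_dtm.tif",
    (501000, 1600000): "E501N1600_dtm.tif",
    (502000, 1600000): "E502N1600_dtm.tif",
}

# A user drags a selection box on the map; only indexed tiles reach the cart.
selection = tiles_in_bbox(500500, 1600200, 501500, 1600900)
cart = [index[corner] for corner in selection if corner in index]
print(cart)
```

Keeping the grid lookup separate from the storage keys means the web layer only needs the metadata index to answer a selection; the objects themselves are fetched later, when the cart is processed.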
LiPAD Web Interface - Tile Selection
LiPAD Web Interface - Data Cart
Ceph Object Storage System
Ceph Object Storage - What is Object Storage?
• Data is managed as objects, storage containers with a file-like interface
• Objects are retrieved by their unique ID
• Offers storage size scalability
• Replicated backups
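As a rough illustration of these points, the toy class below stores whole objects under a unique ID and keeps several copies. It is an in-memory stand-in written for this example, not Ceph and not any Ceph API; `ToyObjectStore` and the object ID are hypothetical names.

```python
import hashlib

class ToyObjectStore:
    """In-memory stand-in for an object store (illustration only, not Ceph)."""

    def __init__(self, replicas=3):
        # Each "node" is a dict mapping object ID -> bytes.
        self.nodes = [dict() for _ in range(replicas)]

    def put(self, object_id, data):
        # Write the same bytes to every node to mimic replication.
        for node in self.nodes:
            node[object_id] = data
        # Return a checksum so callers can verify the upload.
        return hashlib.sha256(data).hexdigest()

    def get(self, object_id):
        # Objects are looked up by ID alone; return the first surviving copy.
        for node in self.nodes:
            if object_id in node:
                return node[object_id]
        raise KeyError(object_id)

store = ToyObjectStore(replicas=3)
checksum = store.put("tile-0001.laz", b"point cloud bytes")
store.nodes[0].clear()  # simulate losing one storage node
recovered = store.get("tile-0001.laz")
print(recovered == b"point cloud bytes")
```

There is no directory tree to maintain: the ID is the only handle, which is what lets capacity grow by adding nodes rather than restructuring paths.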
Ceph Object Storage – Ceph Features
• Open source
• Compatible with the OpenStack Swift and Amazon S3 APIs
• Support for broad spectrum of programming languages
• Runs on commodity hardware
• Designed to be self-healing and self-managing
• Representational State Transfer (REST) API
• Block storage or virtualized hard disks
• Object storage accessible via
o HTTP REST
o OpenStack Swift or Amazon S3 API
o C++, Java, Python, Ruby, PHP
Ceph Object Storage - Architecture
• Object gateway services handle requests from applications
• Monitor nodes ensure high availability
• Objects are stored inside Object Storage Devices (OSDs)
Archiving Process Flow
• Post-processed data is tiled into 1 km × 1 km tiles
• Each tile is named after the northing and easting values (EPSG:32651)
• Tiles are uploaded to Ceph and metadata is extracted
• Metadata is input into LiPAD
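A minimal sketch of the naming and metadata steps, under stated assumptions: the slides do not give the exact naming pattern, so the `E<km>N<km>` format, the lower-left-corner convention, and the `tile_record` helper are hypothetical. Only the 1 km tile size and EPSG:32651 come from the process above.

```python
def tile_record(easting, northing, payload, tile_size=1000):
    """Build a metadata record for the 1 km tile containing a point (metres)."""
    # Snap the point to the tile's lower-left corner (assumed convention).
    e0 = int(easting // tile_size) * tile_size
    n0 = int(northing // tile_size) * tile_size
    return {
        # Hypothetical name pattern: corner coordinates in whole kilometres.
        "name": f"E{e0 // 1000}N{n0 // 1000}",
        "srs": "EPSG:32651",
        "bbox": (e0, n0, e0 + tile_size, n0 + tile_size),
        "size_bytes": len(payload),
    }

# Example point in projected metres; the payload stands in for tile contents.
record = tile_record(283450.7, 1621980.2, b"\x00" * 2048)
print(record["name"], record["bbox"])
```

Because the name is derived from the coordinates, the same function can run at upload time (to choose the object name) and at indexing time (to fill the metadata record), so the two stay consistent.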
Summary
• There is a need for LiDAR mapping in the Philippines for hazard assessment and natural resource accounting
• Archiving, indexing, and distributing these sizable data sets is a challenge
• We use a combined approach, utilizing GeoNode with GeoServer and Ceph Object Storage
• The setup also paves the way for migrating to a distributed computing platform
Acknowledgements
The authors would like to acknowledge the support of the Department of Science and
Technology – Philippine Council for Industry, Energy and Emerging Technology Research
and Development (DOST-PCIEERD) and the Phil-LiDAR 1 research and training staff.
Thank you very much!