DEVELOPMENT OF DATA ARCHIVING AND DISTRIBUTION SYSTEM FOR THE PHILIPPINES' LIDAR PROGRAM USING OBJECT STORAGE SYSTEMS

Ken Abryl Eleazar Salanio
Data Archiving and Distribution Component, PHL-LiDAR 1
kasalanio@dream.upd.edu.ph

FOSS4G 2015 – Seoul, South Korea – September 14th-19th, 2015

Outline
• Introduction
• Related Work
• Working Design
• Ceph Object Storage System
• Archiving Process Flow
• Summary

Introduction

Introduction – The Philippine Hazard Setting

Figure: 2014 Typhoon Tracks
Image sources:
https://en.wikipedia.org/wiki/Timeline_of_the_2014_Pacific_typhoon_season
https://en.wikipedia.org/wiki/Ring_of_Fire

• The Philippines is situated along the Pacific Typhoon Belt and the Ring of Fire
• It is prone to earthquakes, typhoons, and other hazards
• It is abundant in natural resources
• Mapping is needed to assess disaster risk and to account for natural resources

• The Philippines’ Department of Science and Technology (DOST), together with Higher Education Institutions (HEIs), organized two mapping programs: PHL-LiDAR 1 and PHL-LiDAR 2
• These programs are an extension of the Disaster Risk and Exposure Assessment for Mitigation (DREAM) LiDAR program

PHL-LiDAR 1          PHL-LiDAR 2
Data Acquisition     Agriculture
Data Validation      Forest
Data Processing      Coastal
Training & IEC       Energy
Data Archiving       Hydrology
Flood Modeling

• LiDAR mapping produces high-resolution geospatial data
• High-resolution data acquisition produces very large data volumes
• Storage, indexing, retrieval, and distribution prove a challenge

Related Work

Related Work - File Based Storage
• File-based storage systems are commonly used to store data
• Little setup needed
• Pervasive technology
• Complexity of the directory structure increases with the amount of data and the number of processes

Related Work - GIS-enabled RDBMS
• GIS-enabled Relational Database Management Systems, or geodatabases
• Two types of design:
  o Spatial indexing is on a separate layer
  o Specialized spatial columns
• Indexing overhead, especially on updates
• Scalability and query time issues
• Limited support for point cloud data

Related Work - Combined Approach
• Combines LiDAR flat tiles, RDBMS, and distributed infrastructure
  o RDBMS manages metadata
  o LiDAR tiles are stored in dedicated and distributed storage
  o Data processing is carried out by high-performance compute servers

• Example: OpenTopography’s architecture
• Comprises various software and hardware resources
• Actual data sets are stored in ASPRS LAS format on a dedicated storage server
• Metadata is stored in an IBM DB2 database
• Other datasets are stored on the SDSC Cloud platform
• Processing and visualization requests are handled by a dedicated large-memory, multiprocessor system
Image source: http://www.opentopography.org/index.php/about/systemarch2

• While highly appealing, this approach raises concerns:
  o Cost and difficulty of infrastructure upgrades
  o Internet connection speed and reliability

Working Design

Working Design - LiDAR Portal for Archiving and Distribution (LiPAD)
• LiPAD, the customized GeoNode platform, handles the web user interface
• GeoServer stores small shapefiles, raster, and vector data
• Large files are tiled, named and indexed by the coordinates of each tile, and stored in Ceph
• Metadata of tiled files is indexed in LiPAD, represented by a tiled shapefile of the Philippines

• Added features:
  o Authentication using Active Directory
  o Metadata indexing for Ceph objects, represented as a grid
  o Tiled selection of data
  o Data cart

LiPAD Web Interface - Tile Selection

LiPAD Web Interface - Data Cart

Ceph Object Storage System

Ceph Object Storage - What is Object Storage?
• Data is managed as objects: storage containers with a file-like interface
• Objects are retrieved by their unique ID
• Offers storage size scalability
• Replicated backups

Ceph Object Storage - Ceph Features
• Open source
• Compatible with OpenStack and Amazon AWS
• Support for a broad spectrum of programming languages
• Runs on commodity hardware
• Designed to be self-healing and self-managing
• Representational State Transfer (REST) API

• Block storage or virtualized hard disks
• Object storage accessible via
  o HTTP REST
  o OpenStack Swift or Amazon S3 API
  o C++, Java, Python, Ruby, PHP

Ceph Object Storage - Architecture
• Object gateway services handle requests from applications
• Monitor nodes ensure high availability
• Objects are stored inside Object Storage Devices (OSDs)

Archiving Process Flow
• Post-processed data is tiled into 1x1 km tiles
• Each tile is named after its northing and easting values (EPSG:32651)
• Tiles are uploaded to Ceph and metadata is extracted
• Metadata is input into LiPAD

Summary
• There is a need for LiDAR mapping in the Philippines for hazard assessment and natural resource accounting
• Archiving, indexing, and distributing these sizable data sets prove to be a challenge
• We use a combined approach, utilizing GeoNode with GeoServer, and Ceph Object Storage
• The setup also paves the way for migrating to a distributed computing platform

Acknowledgements
The authors would like to acknowledge the support of the Department of Science and Technology – Philippine Council for Industry, Energy and Emerging Technology Research and Development (DOST-PCIEERD) and the Phil-LiDAR 1 research and training staff.

Thank you very much!
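Supplementary sketch: the tile-naming step of the archiving process flow (1x1 km tiles named after their northing and easting values in EPSG:32651) could look roughly like the Python below. The key pattern `E<easting_km>N<northing_km>`, the function names, and the sample coordinates are assumptions made for illustration only; the slides do not state LiPAD's actual naming convention.

```python
from collections import defaultdict

TILE_SIZE_M = 1000  # 1 x 1 km tiles, as stated in the process flow


def tile_origin(easting, northing, tile_size=TILE_SIZE_M):
    """Return the lower-left corner (in metres) of the tile containing a point."""
    # Floor division snaps each coordinate down onto the tile grid.
    return (int(easting // tile_size) * tile_size,
            int(northing // tile_size) * tile_size)


def tile_name(easting, northing, tile_size=TILE_SIZE_M):
    """Build a hypothetical object key from the tile's lower-left corner in km."""
    e0, n0 = tile_origin(easting, northing, tile_size)
    return f"E{e0 // tile_size}N{n0 // tile_size}"


def group_points_by_tile(points):
    """Group (easting, northing, elevation) tuples by their tile name."""
    tiles = defaultdict(list)
    for e, n, z in points:
        tiles[tile_name(e, n)].append((e, n, z))
    return dict(tiles)


# Made-up sample points in EPSG:32651 (UTM zone 51N) coordinates.
sample_points = [
    (281534.2, 1621987.6, 12.5),
    (281999.9, 1621000.0, 13.0),
    (282000.0, 1621500.0, 9.1),
]
tiles = group_points_by_tile(sample_points)
```

A name built this way could then serve as the unique object ID when a tile is uploaded through Ceph's S3- or Swift-compatible gateway, so a tile picked on the grid maps directly to one stored object.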