Erik de Jong & Willem Bouma

- Arithmetic Coding
- Octree Compression
- Surface Approximation
- Child Cell Configurations
- Single Child Cell Configurations
- Results
- Questions

Assign to every symbol a range from the interval [0, 1). The size of the range represents the probability of the symbol occurring.

Example:

Symbol  Probability  Range
A       60%          [0.0, 0.6)
B       20%          [0.6, 0.8)
C       10%          [0.8, 0.9)
D       10%          [0.9, 1.0)   (end-of-data symbol)

The example shows the decoding of 0.538. The value lies in [0, 0.6), so the first symbol is A. Rescaled to that range it becomes 0.538 / 0.6 ≈ 0.897, which lies in [0.8, 0.9), so the second symbol is C. Rescaled again it becomes (0.897 - 0.8) / 0.1 ≈ 0.967, which lies in [0.9, 1.0), so the third symbol is D and decoding stops.

Huffman coding is a special case of arithmetic coding:
- Every symbol is converted to a bit sequence of integer length.
- Probabilities are rounded to negative powers of two.
- Advantage: parts of the input stream can be decoded.
- Disadvantage: arithmetic coding comes much closer to the optimal entropy encoding.

SQUEEL
- Estimate/approximate as much as possible.
- Store only the differences w.r.t. the estimate.
- A better estimate -> smaller numbers -> lower entropy -> better compression.

What is an Octree?
Per cell we store only the occupied child cells. Options:
- Store a single byte, each bit representing a child cell (for example 11001100).
- Store the number of occupied cells e and a tuple T with the indices of the occupied cells (for example e=4, T={0,1,4,5}).

We will approximate/estimate/compress:
- The surface
- Number of non-empty child cells
- Child cell configuration
- Index compression
- Single child cell configuration

Every level of the octree will yield a preliminary approximation Q of the complete point cloud P.
For a cell that is to be subdivided:
- Predict the surface F based on Moving Least Squares (MLS) on the k nearest points in Q.

Prediction of the number of non-empty cells e
- Based on an estimation of the sampling density ρ.
- Local sampling density ρ_i at point p_i in P, with:
  k: number of nearest points
  p_i: point in the point cloud P
  q_i: point on the surface approximation Q
- Sampling density:
- Guess the number of child cells e based on:
  the area of the plane F
  the sampling density ρ

Quality of prediction
Graphs show the difference between the estimated value and the true value.
(a) The level 5 octree (b) The level 7 octree (c) The entire octree.

Given the number of non-empty child cells e, there is only a limited number of configurations: C(8, e) = 8! / (e! (8 - e)!).
- We have an array with all possible configurations and their weights, sorted in ascending order of weight.
- Each configuration of the subdivision is encoded as an index into the array.
- Common configurations get lower weights, which means smaller indices, which means lower entropy.

Cell centers tend to be close to F.
To find the weight of a configuration:
- Sum up the (L1) distances from the cell centers to F.

The index of the configuration in the sorted array is encoded using arithmetic coding under two contexts:
- First context: the octree level of the cell C.
- Second context: the expressiveness e(F).

e(F) reflects the angle of the plane to the coordinate directions. In order to use e(F) as a context for arithmetic coding it has to be quantized.
Five bins were found to be sufficient and delivered the best results.

We can exploit an observation for cells that have only one occupied child cell:
- Scanning devices often have a regular sampling grid.
- It is possible to predict samples on the surface, rather than just close to the surface.
- This is relevant for the finer levels in the octree hierarchy.

For cells with only one occupied child cell we can predict T based on the nearest neighbours:
  m: centroid of the k nearest neighbours

Quite surprisingly, the cell center projections on F that are farthest away from m are most likely to be occupied. The area farther away can be seen as undersampled, so a sample in that area becomes more likely: we expect the surface to be regularly sampled, and no undersampling should exist.

The weights for the eight possible configurations are given in terms of:
  c(T): cell center of T
  prj(F, c(T)): projection of c(T) on F

Extra attributes can be encoded with an octree:
- Color
- Normals

Compressing color
- Same two-step method as with coordinates.
- Different prediction functions are needed.

Model    Number of points  Raw size   Compressed (bpp)  Compressed size
Dragon   2,748,318         31.44 MB   5.06              1.66 MB
Venus    ~134,000          1569 KB    11.27             184 KB
Rabbit   ~67,000           768 KB     11.37             93 KB
MaleWB   ~148,000          1734 KB    8.87              160 KB

bpp: bits per point.
Raw size is assumed to use 3 times 4 bytes per point.
The octree uses 12 levels, except for the Dragon, for which the number of levels is unknown.

(b) uses 1.89 bpp ?
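
As a sanity check on the results table, the raw and compressed sizes can be recomputed from the point count and the bits-per-point figure. The following is a minimal sketch, not part of the original slides. It assumes that "MB" and "KB" in the table are binary units (1024-based), which is an assumption that appears to fit the listed values. The function names are illustrative.

```python
# Raw storage stated in the slides: 3 coordinates, 4 bytes each.
BYTES_PER_POINT_RAW = 3 * 4


def raw_size_bytes(num_points: int) -> int:
    """Uncompressed size in bytes for a point cloud with num_points points."""
    return num_points * BYTES_PER_POINT_RAW


def compressed_size_bytes(num_points: int, bpp: float) -> float:
    """Compressed size in bytes, given the average number of bits per point."""
    return num_points * bpp / 8.0


def to_mib(num_bytes: float) -> float:
    """Convert bytes to binary megabytes (assumed meaning of 'MB' here)."""
    return num_bytes / (1024 * 1024)


def to_kib(num_bytes: float) -> float:
    """Convert bytes to binary kilobytes (assumed meaning of 'KB' here)."""
    return num_bytes / 1024


if __name__ == "__main__":
    dragon_points = 2_748_318  # exact count from the table
    dragon_bpp = 5.06          # bits per point from the table

    raw = to_mib(raw_size_bytes(dragon_points))
    packed = to_mib(compressed_size_bytes(dragon_points, dragon_bpp))
    print(f"Dragon raw:        {raw:.2f} MB")
    print(f"Dragon compressed: {packed:.2f} MB")
    print(f"Compression ratio: {raw / packed:.1f} : 1")
```

Under these assumptions the compressed size of the Dragon comes out at about 1.66 MB, in line with the table. The raw size comes out at about 31.45 MB, marginally above the listed value, possibly because of rounding in the slides. The other models only have approximate point counts, so their rows can only be reproduced approximately.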