Report: Heat Kernel Signature
Dafne van Kuppevelt, Vikram Doshi, Seçkin Savaşçı

Introduction
The heat kernel k_t(x, y) can be thought of as the amount of heat that is transferred from x to y in time t, given a unit heat source at x. The Heat Kernel Signature (or HKS) is obtained by restricting the heat kernel to the temporal domain. The work described in this report is an implementation of the Heat Kernel Signature, including retrieval and visualization. More precisely, our goal was to compute the HKS of 684 given 3D models and to find a useful way to use the HKS for retrieval.

Some implementation details up front: we mainly used C#, with a small part written in C++. Development was done in Visual Studio. In our final version, all external libraries have a non-commercial license. The project is hosted on Google Code: http://code.google.com/p/infomr-group-x. As proof of correctness we provide both visualizations and graphs. The visualizer for 3D models is built on the XNA framework.

A brief word about the method
For a compact manifold M, the heat kernel has the following eigendecomposition:

k_t(x, y) = \sum_{i=1}^{\infty} e^{-\lambda_i t} \Phi_i(x) \Phi_i(y)

where \lambda_i and \Phi_i are the i-th eigenvalue and the i-th eigenfunction of the Laplace-Beltrami operator, respectively. Rather than using the heat kernel itself as a point signature, the Heat Kernel Signature (HKS) is obtained by restricting it to the temporal domain. It has been proved in 'A Concise and Provably Informative Multi-Scale Signature Based on Heat Diffusion' that the HKS inherits many properties from the heat kernel, such as being invariant under perturbations of the shape, and can therefore be used in applications that involve deformable shapes. More remarkably, it has also been proved that, under certain mild assumptions, the set of Heat Kernel Signatures at all points of the shape fully characterizes the shape. This means that the Heat Kernel Signatures are concise and easily commensurable, since they are defined over a common temporal domain, while at the same time preserving almost all of the information contained in the heat kernel.

The signature captures information about the neighborhood of a point on a shape by recording the dissipation of heat from the point onto the rest of the shape and back to itself over time. Because heat diffuses to progressively larger neighborhoods, the varying time parameter describes the shape around a point at multiple scales: detailed, highly local shape features are observed through the behavior of heat diffusion over short times, while summaries of the shape in large neighborhoods are observed through the behavior of heat diffusion over longer times.

Method
There are several ways to use the Heat Kernel Signature to compute the distance between objects. We tried and implemented four different methods, each being an improvement on the previous one. In all our methods we compute a feature vector for each model and compute the distance between models using the L2-norm on these feature vectors.

Method 1
Our initial, crude method was to compute the sum of the HKS over all N vertices:

HKSSum_t = \sum_{x=1}^{N} k_t(x, x)

We compute this for 8 different values of t, uniformly sampled on a logarithmic scale over [-4, 2]. This results in a feature vector of length 8 for each model.
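As an illustration of Method 1, the following is a minimal C# sketch of the HKSSum computation, assuming the eigenvalues and eigenvectors of the Laplacian have already been computed and loaded into arrays. The names (HksMethod1, HksSum, FeatureVector, eigenvalues, eigenvectors) are illustrative and not our actual class layout.

```csharp
using System;
using System.Linq;

static class HksMethod1
{
    // k_t(x, x) = sum_i exp(-lambda_i * t) * Phi_i(x)^2, summed over all vertices x.
    // eigenvalues[i]     : lambda_i
    // eigenvectors[i][x] : Phi_i(x), the i-th eigenvector evaluated at vertex x
    static double HksSum(double t, double[] eigenvalues, double[][] eigenvectors, int vertexCount)
    {
        double sum = 0.0;
        for (int x = 0; x < vertexCount; x++)
            for (int i = 0; i < eigenvalues.Length; i++)
                sum += Math.Exp(-eigenvalues[i] * t) * eigenvectors[i][x] * eigenvectors[i][x];
        return sum;
    }

    // Feature vector of length 8: t sampled uniformly on a log10 scale over [-4, 2].
    static double[] FeatureVector(double[] eigenvalues, double[][] eigenvectors, int vertexCount)
    {
        return Enumerable.Range(0, 8)
            .Select(k => Math.Pow(10.0, -4.0 + 6.0 * k / 7.0))   // 8 log-spaced time points
            .Select(t => HksSum(t, eigenvalues, eigenvectors, vertexCount))
            .ToArray();
    }
}
```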
The disadvantage of this method is that the values of t are fixed across all models and chosen rather arbitrarily. The time points may therefore not be very representative for a model: in a larger model, for example, it takes more time for the heat to spread.

Method 2
We therefore used a more refined way to sample the time points. For each model we sample 100 points on a logarithmic scale over the time interval [t_min, t_max], where

t_min = \frac{4 \ln 10}{\lambda_{end}}, \qquad t_max = \frac{4 \ln 10}{\lambda_2}

Here \lambda_2 is the second eigenvalue and \lambda_{end} is the last eigenvalue. We then again compute HKSSum_t for each time point and obtain a feature vector of length 100 for each model. We noticed that HKSSum_t remains almost unchanged for t > t_max, as it is then mainly determined by the eigenvector \Phi_2.

Method 3
We can approximate the HKS by using only the first 300 eigenvalues and eigenvectors. We then get:

k_t(x, x) \approx \sum_{i=1}^{300} e^{-\lambda_i t} \Phi_i(x) \Phi_i(x)

HKSSum_t = \sum_{x=1}^{N} k_t(x, x)

We again sample the time points on a logarithmic scale over the interval [t_min, t_max], but now with

t_min = \frac{4 \ln 10}{\lambda_{300}}

This method requires significantly less computation than the previous one, since we only need to calculate 300 eigenvalues and eigenvectors. The computation of the HKS for one vertex also takes less time, namely O(300) instead of O(N). The feature vector again has length 100.

Method 4
In the previous methods we sum over all vertices to get the signature of the complete model. This means that points on one model are not compared to corresponding points on another model. To accomplish this, in the fourth method we select 200 distinct feature points on the model: the points with the highest local maxima of the HKS across scales, compared using the scaled distance

d_{[t_{min}, t_{max}]}(x, x') = \sum_{t = t_{min}}^{t_{max}} \left| \frac{k_t(x, x)}{\sum_{i=1}^{N} k_t(x_i, x_i)} - \frac{k_t(x', x')}{\sum_{i=1}^{N} k_t(x'_i, x'_i)} \right|

This results in a feature vector of length 200 * 100 (200 vertices and 100 different t values per vertex) for each model.

Visualization of the vertices with the highest HKS values; from left to right: the 15, 100, 200 and 500 highest vertices.

The persistent feature points shown above are the high-curvature points which have a local maximum. As the pictures show, using all vertices does not give an efficient feature vector, while taking only a few vertices loses a lot of important information. We therefore chose to use 200 vertices.

Implementation
As soon as we were confident about, and had reached consensus on, the ideas in our reference paper, we started the implementation. Rather than implementing the whole pipeline at once, we divided the implementation into four stages: the Initial Parser, where we compute the Laplacian matrices of the 3D models and the eigenvalues and eigenvectors of those matrices; the HKS Calculation stage, where we use these eigenvalues and eigenvectors to calculate the HKS feature vectors; Retrieval, where we focus on matching, distance calculations and the resulting performance measures; and the Visualization stage, where we visualize the results of the Heat Kernel Signature calculations.

Initial Parser
The Initial Parser contains the operations that take much time but, fortunately, only need to be done once. These include the computation of the Laplacian matrices and the corresponding eigenvalues and eigenvectors. The main challenge in this stage is to find efficient techniques to perform these calculations and to make sure that the results can easily be used in later stages.

a. Computing Laplacian Matrices
We tried to write the code for the computations ourselves as much as possible, without using external libraries.
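For reference, the graph Laplacian L = D - A of a triangle mesh can be assembled directly from its edge list. The following is a minimal, dense sketch for clarity; it is not the bit-packed representation described below, and the parameter names are illustrative.

```csharp
using System.Collections.Generic;

static class LaplacianBuilder
{
    // Graph Laplacian L = D - A for a mesh with the given vertex count and edge list.
    // Off-diagonal: L[i, j] = -1 if vertices i and j share an edge, 0 otherwise.
    // Diagonal:     L[i, i] = degree of vertex i.
    // Assumes each undirected edge appears exactly once in the list.
    static double[,] BuildGraphLaplacian(int vertexCount, IEnumerable<(int i, int j)> edges)
    {
        var laplacian = new double[vertexCount, vertexCount];
        foreach (var (i, j) in edges)
        {
            laplacian[i, j] = -1.0;
            laplacian[j, i] = -1.0;   // the matrix is symmetric
            laplacian[i, i] += 1.0;   // each edge increases the degree of both endpoints
            laplacian[j, j] += 1.0;
        }
        return laplacian;
    }
}
```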
Our goal was to implement a method that parses the 3D models and writes the Laplacian matrices to files, one per model. Our first working implementation stored each element of the resulting Laplacian matrix as an integer and wrote the output in a human-readable format. Tests showed, however, that it was slow and consumed too much space: parsing every model would take more than 2 hours, and for the largest models (in terms of vertex count) the matrix would not even fit in the x86 address space. To be precise, our largest model has 45000 vertices, which results in a matrix data structure of 45000 * 45000 * 4 B = 7.54 GB. This is larger than the address space accessible to our process (2 GB, an OS restriction on 32-bit Windows), and also larger than the 2^32 bytes (4 GB) addressable on x86 at all.

Our solution was to reduce the matrix and to represent each element with a single bit. The Laplacian matrix is a sparse symmetric matrix in which each diagonal value can be recomputed by traversing the corresponding row or column. We therefore store only the strictly lower triangular part: the upper triangle follows by symmetry, and the diagonal is left out because it can be computed by row traversal. By definition of the Laplacian matrix, the elements of this part can only take the values -1 or 0; we store -1 as a 1, so each element fits in one bit. If the program asks for an element of the upper triangle, we return the corresponding element of the lower triangle, and if a diagonal element is needed, we compute it on the fly. Using this scheme we were able to parse all files in 2 minutes, including the largest ones that the previous implementation could not handle. To be precise again, we now need only 45000 * 44999 / 2 bits ≈ 120 MB for the largest model, a compression of roughly 64x.

Figure 1: Simple model; Table 1: Our first implementation; Table 2: Compressed implementation

The next decision was the storage format for these Laplacian matrices. Our first implementation produced a human-readable text file, in which the values were converted to strings with white space and line feeds added for readability. Such formatting, however, increases the parsing time, and we had to implement a loader method to read these matrices back into memory when needed. We eventually decided to serialize the matrix data to files instead. With this decision the resulting files are no longer human-readable, but they can simply be deserialized in our program for further calculations.
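The indexing scheme for this packed lower-triangular storage can be sketched as follows. This is an illustrative reconstruction rather than our exact class: it uses a System.Collections.BitArray to hold the packed bits and computes the diagonal (vertex degree) by traversing a row on demand.

```csharp
using System.Collections;

// Sketch of a bit-packed, strictly lower-triangular Laplacian.
// Element (i, j) with i > j is stored as one bit: 1 means L[i, j] = -1, 0 means L[i, j] = 0.
class PackedLaplacian
{
    private readonly int n;          // number of vertices
    private readonly BitArray bits;  // n * (n - 1) / 2 bits, stored row by row

    public PackedLaplacian(int vertexCount)
    {
        n = vertexCount;
        bits = new BitArray(vertexCount * (vertexCount - 1) / 2);
    }

    // Position of element (i, j), i > j, within the packed lower triangle.
    private static int Index(int i, int j) => i * (i - 1) / 2 + j;

    public void SetEdge(int i, int j)
    {
        if (i < j) (i, j) = (j, i);          // only the lower triangle is stored
        bits[Index(i, j)] = true;
    }

    public int Get(int i, int j)
    {
        if (i == j)                          // diagonal = vertex degree, computed on the fly
        {
            int degree = 0;
            for (int k = 0; k < n; k++)
                if (k != i && Get(i, k) != 0) degree++;
            return degree;
        }
        if (i < j) (i, j) = (j, i);          // upper triangle answered by symmetry
        return bits[Index(i, j)] ? -1 : 0;
    }
}
```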
b. Computing Eigenvectors & Eigenvalues
Computing eigenvalues and eigenvectors is a challenging problem because of its high computational complexity. As stated before, we first tried to implement our own eigendecomposition, but after considerable effort we did not manage to produce a working version, so we decided to use a third-party library. The literature describes specialized techniques, such as the Lanczos algorithm, for particular kinds of matrices, so we looked for a library suited to our input: sparse symmetric Laplacian matrices. We could not find a library with an eigendecomposition specialized for this type of matrix, so we tried and evaluated every library we could find on the web:

Library       Specification                      License   Time       Memory
Math.NET      C#, x86, sequential, LAPACK        xGPL      20+ mins   Average
DotNumerics   C#, x86, sequential, LAPACK port   xGPL      2 mins     Average
CenterSpace   C#, x86, sequential, LAPACK        $1295     30+ mins   Good
Eigen         C++, x86, sequential, unique       xGPL      2 hours    Good
Table 3: Evaluation results for a 2000-vertex model

After this evaluation we decided to use the DotNumerics library. With DotNumerics we had a working eigendecomposition, but it had some drawbacks for our project. We had to ignore the largest ~40 matrices: DotNumerics stores matrix elements in double precision, and although it has a symmetric matrix storage class, the memory needed for the largest matrices exceeds what the x86 architecture can provide. We also had to create a library copy of each Laplacian matrix during parsing, because DotNumerics uses LAPACK routines under the hood, which means our bitwise matrix format cannot be used directly as input for its eigendecomposition methods. We continued to use serialization for the outputs of the library: the computed eigenvalues and eigenvectors are serialized to corresponding files.

c. An Experimental Approach for Future Implementations
Although we used DotNumerics throughout the project and ignored the largest matrices, we also built a prototype for future work. The prototype is written in C++ rather than C#, and uses the Intel Math Kernel Library, Boost, and a tweaked version of the Armadillo library. It runs on the x64 architecture and takes only 10 seconds to compute the eigenvalues and eigenvectors of a 2000-vertex model (recall that DotNumerics does the same job in 2 minutes). We had two main reasons for developing this prototype. First, today's CPUs are mostly x64 capable, and we wanted to see how much the same implementation would improve when run on x64. Second, a 32-bit Windows OS limits the available memory to 2 GB per process, and we need more memory than that to parse the largest matrices. For the largest matrices, however, the prototype's performance was not better: we tested it on a machine with 3 GB of main memory, so the rest of the data had to be stored in virtual memory, and the resulting disk I/O increased the computation time dramatically.

HKS Calculation
In the final version of our program we calculate the HKS feature vectors in three different ways, corresponding to the second, third and fourth method described earlier. The implementation of these calculations is straightforward: for each model, the program reads the eigenvalue and eigenvector files and uses them to compute the feature vectors.

Retrieval
In the Retrieval program, the null-transformation of each model is found and used as a query. For increasing scope size, the program retrieves the closest items (in terms of L2-distance between the feature vectors) and computes the recall, specificity and precision of the retrieval. This yields the values for an ROC curve and a recall/precision graph for each model.
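A minimal sketch of this retrieval step is shown below, assuming each model is represented by its feature vector and a class label. The names (Retrieval, L2Distance, Evaluate, featureVectors, labels) are illustrative and not our actual code; it also assumes every class contains more than one model.

```csharp
using System;
using System.Linq;

static class Retrieval
{
    // L2-distance between two feature vectors of equal length.
    static double L2Distance(double[] a, double[] b) =>
        Math.Sqrt(a.Zip(b, (x, y) => (x - y) * (x - y)).Sum());

    // For one query model, rank all other models by distance to the query and
    // report (recall, precision, specificity) for every scope size.
    static (double recall, double precision, double specificity)[] Evaluate(
        int query, double[][] featureVectors, string[] labels)
    {
        var ranked = Enumerable.Range(0, featureVectors.Length)
            .Where(m => m != query)
            .OrderBy(m => L2Distance(featureVectors[query], featureVectors[m]))
            .ToArray();

        int relevant = ranked.Count(m => labels[m] == labels[query]);   // same-class models
        int irrelevant = ranked.Length - relevant;

        var results = new (double recall, double precision, double specificity)[ranked.Length];
        int truePositives = 0;
        for (int scope = 1; scope <= ranked.Length; scope++)
        {
            if (labels[ranked[scope - 1]] == labels[query]) truePositives++;
            int falsePositives = scope - truePositives;
            results[scope - 1] = (
                recall: (double)truePositives / relevant,
                precision: (double)truePositives / scope,
                specificity: (double)(irrelevant - falsePositives) / irrelevant);
        }
        return results;
    }
}
```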
Visualization & Proof of Correctness
To show that our computations are correct, we visualize the models together with their HKS. The 3D model visualizer is built on the XNA framework: our implementation reads a 3D model and the corresponding HKS value file and draws a 3D scene, which can be navigated using an Xbox controller. If HKS values for several time steps are provided, the model can also be inspected at those different time values.

Intuitively, k_t(x, x) can be interpreted as the probability of a Brownian motion starting at x returning to x after time t, i.e. the weighted average over all possible paths between x and x in time t. Below we show how the heat transfer visually takes place in the model for different time steps t. When t is infinitesimally small (tending to 0, but not 0), the object is completely red: we apply a unit heat source at each vertex, and only a very small amount of heat has been transferred away. Parts with higher curvature always show more redness, i.e. a higher heat content. As time passes the object gets colder and colder, turning from red to green and eventually blue.

Figure: color changes under different t values; from left to right the t values increase.
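To color a vertex, its HKS value (normalized to [0, 1] over the model for the current t) is mapped onto a heat-style ramp from blue through green to red. A minimal sketch of such a mapping is shown below; it is an illustrative helper, not our exact XNA drawing code.

```csharp
using System;

static class HeatColor
{
    // Maps a normalized HKS value in [0, 1] to an (r, g, b) triple in [0, 1]:
    // 0 -> blue (cold), 0.5 -> green, 1 -> red (hot).
    static (float r, float g, float b) FromNormalizedHks(double value)
    {
        double v = Math.Max(0.0, Math.Min(1.0, value));   // clamp to [0, 1]
        if (v < 0.5)
        {
            double s = v / 0.5;                           // blue -> green
            return (0f, (float)s, (float)(1.0 - s));
        }
        else
        {
            double s = (v - 0.5) / 0.5;                   // green -> red
            return ((float)s, (float)(1.0 - s), 0f);
        }
    }
}
```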
Performance
All three methods using the Heat Kernel Signature work well for some types of transformations and not at all for others. (The three variants evaluated here correspond to Methods 2, 3 and 4 described above; they are labeled Method 1, 2 and 3 in the figures below.) This is clearly visible in the ROC curves: for most models the curve is flat, with a very high specificity at the beginning, and then suddenly drops. This means that the first models retrieved are all of the same class, while after the drop the remaining objects of the same class are hard to distinguish from objects of other classes. It is also clear from the ROC curves that each method improves on the previous one: whereas the first variant does not work well for all models (the Flamenco curve, for example, is very low), the second has high curves for all models, and the ROC curves of the third are higher still.

Method 1: Sum over all vertices
Figure 2: ROC curve for the sum over all vertices (one curve per model class)
Figure 3: Recall/precision graph for the sum over all vertices

Method 2: Sum using 300 eigenvalues
Figure 4: ROC curve for the sum using 300 eigenvalues
Figure 5: Recall/precision graph for the sum using 300 eigenvalues

Method 3: Scaled vertices
Figure 6: ROC curve for the scaled vertices
Figure 7: Recall/precision graph for the scaled vertices

Conclusion
We have described the implementation of three different variations of the Heat Kernel Signature. The ROC and recall/precision curves show that using one of these methods is a reliable way to match 3D objects. Their main disadvantage is that computing the eigendecomposition is a very costly operation (using the DotNumerics library). However, we expect the prototype built on the Intel Math Kernel Library, Boost and a tweaked version of Armadillo to give faster and more efficient results. Efficiency could be improved even further by avoiding the eigendecomposition altogether and using a different approach to calculate the heat kernel.

For the heat kernel to exist, the model should be a compact Riemannian manifold, possibly with a boundary; a signature method that also works for models with holes, partial models and single views therefore still needs to be developed. In addition, a discretization of the Laplace-Beltrami operator instead of the plain graph Laplacian matrix would give better results, since it takes the entire geometry of the model into account rather than only whether two vertices are adjacent (as in our implementation); one common discretization is sketched below. Finally, more analysis is needed to find the optimum number of scaled vertices and their corresponding time steps.
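As a concrete example of such a discretization (not something we implemented), the widely used cotangent-weight Laplacian assigns to each mesh edge a weight derived from the two angles opposite that edge. Up to sign and area-normalization conventions, and written so that it matches the sign convention of our graph Laplacian, it reads:

L_{ij} = -\tfrac{1}{2}\left(\cot \alpha_{ij} + \cot \beta_{ij}\right), \qquad L_{ii} = -\sum_{j \neq i} L_{ij}

where \alpha_{ij} and \beta_{ij} are the angles opposite the edge (i, j) in the two triangles sharing it. Edges are thus weighted by the local geometry instead of all adjacent vertex pairs being treated equally, which is why such a discretization is expected to approximate the Laplace-Beltrami operator better than the plain graph Laplacian.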