Liz Bondi

V. Vaish, M. Levoy, R. Szeliski, C. L. Zitnick, and S. B. Kang. (2006, June). Reconstructing Occluded Surfaces Using Synthetic Apertures: Stereo, Focus and Robust Measures. 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’06) [Online]. vol. 2, pp. 2331-2338. Available: http://www.computer.org/portal/web/csdl/doi/10.1109/CVPR.2006.244

The paper begins by identifying two problems that arise when reconstructing occluded objects: a limited number of views, and cost functions (the functions that 3D reconstruction algorithms evaluate) that assume the camera can see every object. To increase the number of views, scientists use synthetic aperture focusing, in which a large camera array is arranged so that its combined aperture is wider than the occluding objects. For the second problem, the Stanford scientists, who coincidentally helped build the multi-camera array used as the class example, compare how effectively different cost functions reconstruct occluded scenes.

V. Vaish, M. Levoy, R. Szeliski, C. L. Zitnick, and S. B. Kang explore four cost functions in this paper: shape from focus, shape from stereo, shape from median, and shape from entropy. Shape from focus and shape from stereo are similar in that both are built on the mean color across cameras, though stereo specifically uses the variance of the samples about that mean. The problem with these two cost functions is that rays that hit occluders still contribute to the reconstruction, when they really should be treated as outliers. Hence, shape from median works essentially like stereo but uses the median color instead of the mean, since the median is a more robust measure of central tendency when outliers are present. Shape from entropy relies on the modal color rather than the median or the mean: the entropy of the color distribution is low when most rays agree on a single color. In the three experiments the Stanford scientists conducted, shape from focus was best for high percentages of occluders of similar color, while entropy was best for low percentages of occluders.

Overall, this paper seems to be a good source for several reasons: it was published at the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), it cites twenty sources to back up its claims, and it gives many diagrams, mathematical proofs, and equations that help the reader visualize those claims. For example, when discussing stereo versus focus, there is a diagram of the camera layout, two graphs of intensity, and a final graph comparing the depth responses of stereo and focus. Also, at the beginning, the authors use a conclusion drawn by Schechner et al. to “…inspire this research.” However, the paper does seem a bit disorganized. For example, one section is stereo vs. focus, yet it does not define those cost functions until after it has discussed them, stated a theorem, and proved that theorem. When the definitions finally appear, they are given in several different forms throughout the paper, and the versions seem to contradict one another. Additionally, the paper never gives a clear conclusion about which cost function is best overall.

I got a lot from this source, especially since I had to look up so many details about algorithms, synthetic apertures, and so forth. Once I understood as much of the paper as I could, I decided it was very useful to our project. Specifically, I found the experiments section helpful: it described experimental setups that not only validated the results but also gave me ideas on how we could start building a multi-camera array.
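To make the paper’s four cost measures concrete before returning to the experiments, here is a minimal Python/NumPy sketch of how each cost could be computed per pixel over a stack of grayscale images already aligned to one depth plane. The function name, the variable names, and the choice of 16 histogram bins are my own illustrative assumptions; the paper defines the measures only mathematically and works with color images.

    import numpy as np

    def occlusion_cost_measures(stack):
        """Per-pixel versions of the paper's four cost measures.

        stack: (n_cameras, H, W) array of grayscale samples in [0, 1],
        one image per camera, already aligned to a single depth plane.
        """
        mean = stack.mean(axis=0)  # the synthetic aperture (refocused) image

        # Shape from stereo: variance of the samples about the mean color.
        stereo = ((stack - mean) ** 2).mean(axis=0)

        # Shape from focus: negative squared gradient of the mean image;
        # the refocused image is sharpest at the correct depth.
        gy, gx = np.gradient(mean)
        focus = -(gx ** 2 + gy ** 2)

        # Shape from median: deviations measured about the median, which is
        # less sensitive to occluded (outlier) rays than the mean.
        med = np.median(stack, axis=0)
        median_cost = np.abs(stack - med).mean(axis=0)

        # Shape from entropy: entropy of the per-pixel color histogram;
        # low entropy means most rays agree on one (modal) color.
        n_bins = 16  # bin count is an arbitrary choice for this sketch
        bins = np.clip((stack * n_bins).astype(int), 0, n_bins - 1)
        entropy = np.zeros_like(mean)
        for b in range(n_bins):
            p = (bins == b).mean(axis=0)
            safe_p = np.where(p > 0, p, 1.0)  # avoid log(0); p=0 terms add 0
            entropy -= p * np.log(safe_p)

        return stereo, focus, median_cost, entropy

At the correct depth plane, each of these costs should reach a minimum for unoccluded points; the paper’s experiments examine which minimum stays reliable as the fraction of occluded rays grows.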
As for the experiments that gave me those ideas: in one, an ivy wall stood in front of a person and a statue, and the authors used 88 cameras. In another, in which they tried to image a CD case behind plants, they used a single camera with a very large aperture. This paper can also be useful once a multi-camera array is built and we need to combine images from many cameras to see the occluded objects. All we need to decide is whether the people in an airport will create a high or a low percentage of occlusions. If it is a high percentage, we can use the shape from focus cost function (roughly C_f(x) = −[dĪ_f(x)/dx]², the negative squared gradient of the mean image Ī_f), and if it is a low percentage, we can use the shape from entropy cost function (H = −Σ_{i=0}^{K} p_i log p_i, the entropy of the color histogram). Of course, we will still need to figure out how these algorithms work and how the cost functions are actually used within them, but this saves us from having to determine a cost function ourselves. Therefore, this paper was useful because it told us what happens after we set up a few cameras: how to turn many different images into a single image that shows an occluded object. Now we have some ideas on how to “see through” occlusions.
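Finally, as a rough illustration of the combining step mentioned above (turning many camera images into one refocused image), here is a minimal Python/NumPy sketch of the basic shift-and-average idea behind synthetic aperture focusing. The inverse-depth parallax model, the integer-pixel shifts, and all of the names are simplifying assumptions of mine; the actual paper relies on a calibrated camera array rather than this toy model.

    import numpy as np

    def synthetic_aperture_image(images, offsets, depth):
        """Shift-and-average refocusing for a planar camera array.

        images:  list of (H, W) arrays, one per camera
        offsets: list of (dx, dy) camera positions relative to a reference
                 camera, in units chosen so that parallax is dx / depth
        depth:   focal depth of the plane we want to keep sharp
        """
        acc = np.zeros_like(images[0], dtype=float)
        for img, (dx, dy) in zip(images, offsets):
            # Parallax shrinks with depth: nearby occluders move a lot
            # between cameras and smear out in the average, while the
            # focal plane stays registered and sharp.
            sx = int(round(dx / depth))
            sy = int(round(dy / depth))
            acc += np.roll(img, shift=(sy, sx), axis=(0, 1))
        return acc / len(images)

In practice we would sweep depth over a range of focal planes and score each plane with one of the cost functions above, which is where the high- versus low-occlusion choice between focus and entropy would come in.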