>> Sing Bing Kang: Okay. I'm going to get started. Good morning everyone.
Thanks for coming. It's a pleasure to introduce Katsushi Ikeuchi. He's a
professor at the University of Tokyo. Katsushi is very accomplished: he has
worked in many areas of computer vision, robotics, and graphics, and I'm not
saying that just because I am a former student of his. He has been an IEEE
Fellow for 10 years now, and today he's going to be talking about a subject of
monumental proportions. Katsushi.
>> Katsushi Ikeuchi: Thank you very much. I'm talking about a project called
the e-Monument Project. What is e-Monument? Basically, there are many
historic sites; we digitize these historic sites, and later the data can be
used for scientific investigation, entertainment, and so on.
For this we have to worry about a couple of things. Why do we have to do this
kind of project? One reason is that monuments are priceless and irreplaceable,
so we should pass them down to future generations.
It also provides computer vision and computer graphics (inaudible). In this
e-Monument Project there are a couple of technical issues: one is sensing, one
is content making, one is communication, and one is display.
Today I am mainly talking about the sensing issues; perhaps another day, in
another talk, I'll discuss display and communication. In this e-Monument
Project there are two ways to proceed. One is the 2D scenario: just using
pictures to generate a kind of e-Monument. The other is to actually capture
three-dimensional measurements and then use such 3D data for these displays.
I prefer the 3D scenario, so today I'm talking about the 3D scenario and
especially its sensing issues. In this 3D scenario we build a 3D model of the
monument from images. First we sense the historic site; in this case it is the
(inaudible) Buddha. From this we obtain various 3D data, and by combining
these pieces of information we generate this kind of content.
In this paradigm we first have to worry about shape information and later
about photometric information.
First, geometric information. We begin by obtaining 3D data; each piece of 3D
data is captured from its own viewing direction, so we have to determine the
registrations and then connect everything together.
>>: (Inaudible).
>> Katsushi Ikeuchi: (Inaudible). And this is the area where we are working.
>>: The Bayon Temple, located at the center of Angkor Thom, unites the outlook
on the universe of ancient India and the tradition of the Khmer. The temple
was constructed around the end of the 12th Century to bring relief to the
crisis of the Angkor era. It is well known for its appearance, for example the
calm smiling faces on the towers and the double corridors carved with
beautiful, intricate reliefs.
>> Katsushi Ikeuchi: Actually, this is the temple on which we are working. Why
am I working on this particular Bayon Temple? First of all, it is a huge
structure, so quite a challenging object, and also quite complicated and
beautiful.
Thirdly, the central tower is inclining one (inaudible) per year or so, so
there is a possibility of collapse in the future. Before it collapses, it is a
good idea to obtain 3D data of this temple.
Now, this size poses lots of problems at every step of the 3D processing.
First of all, data acquisition. In data acquisition of the scene we obtain two
kinds of data. One is color images, and as you probably know, in a color image
each pixel stores this information: red, green, blue.
What is a range image? In a range image, each pixel stores the distance to the
corresponding point. Now, how do we obtain this data? We usually use a range
sensor based on time of flight: it projects a laser, measures the return time
of the light, and from that we can determine the distance. This is called time
of flight. One of the commercially available sensors is this Cyrax (phonetic).
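To make the time-of-flight arithmetic concrete (the numbers here are purely
illustrative, not this sensor's specification): the range is half the
round-trip distance of the light,

$$ d = \frac{c\,t}{2}, \qquad \text{e.g. } t = 0.5\,\mu\text{s} \;\Rightarrow\; d = \frac{3\times 10^{8}\,\text{m/s} \times 0.5\times 10^{-6}\,\text{s}}{2} = 75\,\text{m}. $$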
And this is the range data we obtain: from one view it looks like a
black-and-white image, but since we know the distances we can rotate it. Yes,
there are many available sensors, from the Cyrax to the Vivid: the Cyrax works
up to around 100 meters at five-millimeter resolution, the Vivid at around 20
to 30 centimeters with 0.1-millimeter resolution.
>>: (Inaudible).
>> Katsushi Ikeuchi: Just 400K.
>>: 400?
>> Katsushi Ikeuchi: Yes. No, no, no, 400,000. But, you know, the problem is
they are all ground-based.
>>: (Inaudible).
>> Katsushi Ikeuchi: 400,000, not 400 -- how much -- (inaudible) U.S. dollars.
3,000?
>>: 300.
>> Katsushi Ikeuchi: 300,000, right?
>>: $300,000. That's U.S. dollars.
>> Katsushi Ikeuchi: That's the commercially available sensor. Who cares, you
know. Yeah, the problem is it's ground-based. What's wrong with ground-based?
Basically, with ground-based sensors you see all the points visible from the
ground, but of course some portions are missing due to occlusions. What we
usually do is build a scaffold and bring the sensor up on top of the scaffold.
But we are talking about the Bayon Temple, which is roughly 150 meters by 150
meters and 45 meters tall. So we cannot -- of course we can, but you know,
it's not good for business.
So what we are proposing is: why don't you hang the range sensor from a
balloon? Then you can get to any viewing point. And this is the scene where we
do that. So we built the balloon and hung the range sensor. Of course --
>>: (Inaudible) hanging from a balloon.
>> Katsushi Ikeuchi: Yes. (Laughter).
Actually this is home-made, you know. We built it ourselves. It's only 200,000
U.S. dollars, not 300,000.
But anyway, by using this method you are guaranteed any viewing point. The
problem is that the sensor moves during the acquisition, so the obtained data
is distorted. And this is not good, right? So what did we do? Well, we mounted
a TV camera on top of the laser range sensor. Then you can obtain an image
sequence as well as the range data.
So what can we do? Of course, I'm originally from CMU, so we have to use the
famous professor's technique, so-called factorization. This is a feature
tracking result, and the observation matrix is the product of a motion matrix
and a shape matrix. Since this product has rank three, by using this
rank-three constraint we can decompose the observation matrix into a motion
matrix and a shape matrix.
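To make that step concrete, here is a minimal sketch of the rank-three
decomposition (a bare Tomasi-Kanade-style affine factorization; the metric
upgrade that enforces rotation constraints is omitted, and the names are
illustrative):

```python
import numpy as np

def factorize(W):
    """Factor a 2F x P matrix of tracked feature coordinates
    (F frames, P features) into motion and shape using the
    rank-3 constraint. Affine factorization only; the metric
    upgrade is omitted."""
    # Register each row to the centroid of its features.
    W = W - W.mean(axis=1, keepdims=True)
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    # Keep the three dominant singular values: rank-3 constraint.
    M = U[:, :3] * np.sqrt(s[:3])         # 2F x 3 motion matrix
    S = np.sqrt(s[:3])[:, None] * Vt[:3]  # 3 x P shape matrix
    return M, S
```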
But unfortunately the motion matrix obtained this way is not accurate enough
to rectify the data. What can we do? Actually we have three kinds of data: the
distorted range data, the image motion, and the balloon motion. From these we
can extract three constraints; I'll explain the three constraints in a moment.
Using these three constraints we can set up a cost function for estimating the
camera motion and iteratively obtain an accurate estimate of the sensor
motion.
What are the three constraints? First of all, as you know, factorization
determines shape information, meaning the three-dimensional position of each
feature with respect to one particular coordinate system. The factorization
method also estimates what motion the balloon underwent: T_f and R_f.
Also, since this system contains a range sensor, the range sensor measures the
same point with respect to the sensor coordinate system. So the coordinates of
this point obtained from the factorization and from the sensor should
correspond, and we can set up this kind of equation: basically, the 3D data
obtained with the range sensor should correspond to the factorization result
adjusted by the sensor motion. This is the first constraint.
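In symbols (my reconstruction of the notation from the talk): if the
factorization gives feature point $p_i$ in the world frame and balloon motion
$(R_f, T_f)$ at frame $f$, while the range sensor measures the same point as
$x_{f,i}$ in the sensor frame, the first constraint reads

$$ p_i = R_f\, x_{f,i} + T_f \qquad \text{for every frame } f \text{ and tracked feature } i. $$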
>>: At what rate does the range sensor scan and what's the pattern of the scan?
>> Katsushi Ikeuchi: Actually this sensor scans the entire scene in around two
to three seconds.
>>: It goes 360?
>> Katsushi Ikeuchi: Yes. No, no, three seconds. Three seconds, and around,
say, 400 by 600 pixels.
>>: Three seconds and four --
>> Katsushi Ikeuchi: Yeah, yeah.
>>: (Inaudible).
>> Katsushi Ikeuchi: I'm not sure.
>>: So you think that every single pixel will have a different R and T?
>> Katsushi Ikeuchi: Yes, yes. But you know, unfortunately the factorization
method doesn't have 3D data for every single pixel. The factorization only has
3D data at the feature points coming from KLT tracking (inaudible). So we only
have measurements at the KLT features.
The second constraint is bundle adjustment. Once you know the estimated
motion, if you project the factorization result, the projected point should
correspond to the image point. That is the standard approach, and this is
bundle adjustment. The third constraint: I have been working for a long time
on (phonetic), whose origin is the smoothness constraint, so I have a
smoothness constraint.
The reason I use a balloon instead of a helicopter is that a helicopter has
high vibration, while a balloon has smooth motion. Due to that, we can assume
the balloon motion is smooth, so we say this kind of smoothness constraint
should be satisfied.
By combining the range data constraint, the bundle adjustment, and the
smoothness constraint we can set up a minimization formula.
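Putting the three constraints together, the minimization has this general
shape (the weights and exact error terms are my guess at the formulation, not
taken verbatim from the paper):

$$ E(\{R_f, T_f\}, \{p_i\}) = \sum_{f,i} \bigl\| p_i - (R_f x_{f,i} + T_f) \bigr\|^2 + \lambda_1 \sum_{f,i} \bigl\| u_{f,i} - \pi(R_f, T_f, p_i) \bigr\|^2 + \lambda_2 \sum_f \bigl\| T_{f+1} - 2T_f + T_{f-1} \bigr\|^2, $$

where the first term is the range-data constraint, the second is the
bundle-adjustment reprojection error ($u_{f,i}$ is the tracked image point and
$\pi$ the projection), and the third is the smoothness (second difference) of
the balloon motion.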
But for this minimization formula we of course need good initial conditions,
and for that we use the factorization result. So the initial estimate is given
by the factorization result; we then refine the motion parameters using this
cost function, and once the motion parameters are estimated we can recover the
shape. Actually, this was an ICCV paper a long time ago. This is a result:
this is the distorted range data sequence, and these are the rectified
results.
It turned out that even with the sensor vibration the rectified result is
good. This is the totally distorted result and the rectified motion sequence.
Besides this, we also developed several other sensors, and all together they
capture this kind of scene.
>>: In order to scan large architectural structures such as the Bayon Temple,
we have to use different sensors depending on the location of objects in the
site. To scan the deity faces of Bayon, we used a long-range laser sensor
named Cyrax. We measured each face from many positions, such as from the
ground, from scaffolds on the roof, and from a bucket lifted up by a crane.
The data from different directions were integrated, and a 3D digital model of
each face was built. To scan the narrow space between the terrace and the
corridor, a laser sensor that moves vertically along a ladder had been
developed and was used. The Bayon Temple is a huge architectural structure
with a large number of high towers, and it is not practical to scan the upper
site, especially the roofs, from scaffolds. For this task we used the balloon
sensor, a laser sensor suspended under a balloon, which had been developed for
this purpose. Two different types of laser sensors were alternatively equipped
depending on the distance to the target. The balloon was manually controlled
by four ropes pulled from the ground.
>> Katsushi Ikeuchi: So this is the story of how we obtain range data, but it
is not the end of the story. We obtain 2,000 -- 10,000 range data sets from
various viewing directions, so we have to determine their registrations. If we
were talking about 10 to 20 range data sets, there is of course commercially
available alignment software to determine the registrations. But we are
talking about 10,000 scans and a quarter terabyte, which overflows PCs.
What is alignment? Well, basically this is one range data set, and this is
another. We have to determine the registration: set up correspondences and
then gradually reduce the gap. Once the two data sets correspond, we can
determine the rotation and translation. That process is called alignment.
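The alignment loop he describes is essentially ICP (iterative closest point).
A minimal sketch, with brute-force nearest neighbors and illustrative names
(real systems use spatial data structures for the search):

```python
import numpy as np

def icp_step(src, dst):
    """One alignment iteration: pair each source point with its
    nearest destination point, then solve for the rigid motion
    (rotation R, translation t) that reduces the gap."""
    # Brute-force correspondence search (the expensive part).
    d2 = ((src[:, None, :] - dst[None, :, :]) ** 2).sum(-1)
    matched = dst[d2.argmin(axis=1)]
    # Closed-form rigid registration of the matched pairs.
    mu_s, mu_d = src.mean(0), matched.mean(0)
    H = (src - mu_s).T @ (matched - mu_d)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:   # guard against reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = mu_d - R @ mu_s
    return src @ R.T + t, R, t
```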
Then what is the issue in alignment? We are talking about a quarter terabyte,
so it requires a large amount of memory -- it actually overflows main memory
-- and also a long computational time. So we have to do something.
So we developed a two-step alignment. The first is a quick alignment using the
graphics processor; it gives an approximate solution, but quickly, and serves
as the initial alignment. Once this onsite alignment is done, we bring all the
data back to the university and run simultaneous alignment on a PC cluster.
First of all, why do we need this quick alignment? If the computation takes a
long time, the students dry up and die, and that's not good, right? So we need
a quick alignment.
What did we do? Well, the most time-consuming portion is the correspondence
search between data points. If there are, say, 10^6 data points, the
computation is on the order of 10^12 comparisons. In order to reduce this
computation, we find the correspondences using the graphics processing unit.
The first mesh is projected by the graphics processing unit with each vertex
given its own color, and such a color distribution is generated on the
graphics processor. We assume a viewing direction, project the second mesh,
and find the corresponding colors. From that we can determine what
correspondences hold between the first view and the second view.
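A rough CPU emulation of that GPU trick, assuming an orthographic projection
along z; depth buffering and occlusion handling are omitted, and everything
here is illustrative rather than their actual implementation:

```python
import numpy as np

def index_render_correspondences(mesh1, mesh2, res=512):
    """Emulate index rendering: project mesh1's vertices into an
    image grid that stores each vertex's index (its 'color'), then
    project mesh2 from the same assumed viewpoint and read back
    which mesh1 vertex lands in the same pixel. Roughly O(n)
    instead of the O(n^2) brute-force search."""
    lo = np.minimum(mesh1.min(0), mesh2.min(0))[:2]
    hi = np.maximum(mesh1.max(0), mesh2.max(0))[:2]
    def to_pix(pts):
        uv = (pts[:, :2] - lo) / (hi - lo + 1e-9)
        return (uv * (res - 1)).astype(int)
    buf = -np.ones((res, res), dtype=int)   # index buffer
    u, v = to_pix(mesh1).T
    buf[u, v] = np.arange(len(mesh1))       # "paint" vertex ids
    u2, v2 = to_pix(mesh2).T
    ids = buf[u2, v2]                       # look up the hits
    return [(j, i) for j, i in enumerate(ids) if i >= 0]
```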
And this is wrong, actually. It is wrong, but it's fast: the computational
time becomes order n. You see, this axis is the number of points and this is
the time, so this is (inaudible).
Of course it is wrong, but it provides a good initial estimate, and that's
enough for onsite sensor planning as well as an initial solution for the next
processing software.
Now let's talk about the parallel processing software. Why do we need parallel
processing software? First of all, usually this alignment is done
sequentially: first alignment, second alignment, third alignment, fourth
alignment, and so on around the structure, and when you come all the way
around, a huge gap occurs. So we load all the data together in memory and
determine the registrations simultaneously. This is called simultaneous
alignment.
However, this involves a large number of (inaudible), so first we need an
efficient computational method. Basically we linearize the rotations -- cosine
becomes one and sine becomes the angle itself (inaudible) -- and set up the
matrix. Secondly, since this is a sparse matrix, we use ICCG (incomplete
Cholesky conjugate gradient), and then we can reduce the computational time.
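In outline, the small-angle linearization plus sparse solve might look like
this (a sketch with SciPy's conjugate gradient standing in for their ICCG
implementation; the matrix contents are schematic):

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.linalg import cg

def solve_alignment_update(A, b):
    """Solve the linearized simultaneous-alignment system A x = b.
    With cos(theta) ~ 1 and sin(theta) ~ theta, each scan
    contributes 6 unknowns (3 rotation, 3 translation), and a
    correspondence between scans (a, b) touches only those two
    blocks, so A is sparse."""
    A_sparse = csr_matrix(A)     # exploit the sparsity pattern
    x, info = cg(A_sparse, b)    # their version used ICCG
    if info != 0:
        raise RuntimeError("CG did not converge")
    return x  # stacked [rotation; translation] updates per scan
```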
The second issue is the large memory requirement. Since from the initial
alignment we know which data corresponds to which, we make groups of data, and
each group is assigned to one PC in the PC cluster.
The intermediate results are sent to a server computer: the server does the
matrix computation while the clients calculate the correspondences.
>>: A single measurement is not sufficient for modeling a huge structure like
the Bayon Temple; therefore multiple measurements from different viewpoints
are required. In order to construct the entire Bayon 3D model it is necessary
to align the multiple measurements. Given an initial guess of the positional
relationship between two mesh models, our algorithm can align them as if
putting a jigsaw puzzle together in 3D. The result is a correct estimation of
the positional relationship. Iteration of this alignment process constructs
the 3D model. However, this alignment method aligns only two mesh models at
each step; therefore alignment errors accumulate in one location. So we --
>> Katsushi Ikeuchi: Implemented a simultaneous alignment algorithm. And this
is the aligned result. This is around maybe 1,000 gigabytes or something.
And this is still a point representation, so we have to connect the data into
a mesh. But again it is the same story, so I will skip this portion.
Basically, we have to connect the first image and the second image, and this
again requires long computation, so we put this processing on the parallel
computers.
And the final result is this one. This is the generated data: the 145-meter by
145-meter by 45-meter structure is represented at two-millimeter resolution.
This is the outside, but we also obtained inside data, so you can measure the
structure of the Bayon or run simulations, and you can also go inside the
data.
>>: (Inaudible)
>> Katsushi Ikeuchi: Yeah, well, we carefully determined which portions
connect the inside and the outside (inaudible). Now, the main reason I went to
Bayon: originally, at CMU, I was mainly modeling small indoor objects like
this, and I was interested in that; however, once I returned to Tokyo I became
interested in larger, outdoor objects to model. One such target was the
(inaudible) Buddha, which is located outside and is a historical monument.
Then I scanned the (inaudible) Buddha, and I exhausted the big Buddhas in
Japan, so I went to Thailand and scanned there. At that time Thai people said
that the neighboring country, Cambodia, has a temple called the Bayon Temple
with 173 big Buddha faces. So I said, oh, that's interesting, and I went to
the Bayon Temple. It turned out that, of course, measuring each face is
interesting, but the structure is also interesting. So I got into this
entire-structure scanning business.
But of course we also measured the deity faces -- we measured all 173 deity
faces. This is one example, and this is the library of 173 faces. And
(inaudible) face. It is said that these 173 faces can be classified into three
groups (inaudible). So I checked whether we can group these 173 into three
groups. I applied face classification to them, and it turned out that, yes, it
is possible and we can make three groups (inaudible). What is (inaudible)?
Well, this is (inaudible). Typical (inaudible). This is (inaudible). And the
good point of 3D measurement is that by using such 3D data we can do this kind
of scientific investigation, too. And this is (inaudible).
Also, there was a rumor that a couple of independent worker groups did the
construction of the Bayon Temple. So what we did was cluster all the faces by
similarity, and it turned out there are similarity groups: this is one
similarity group, this is the second, this is the third, this is the fourth;
namely, similar faces appear at nearby positions. That supports the conjecture
that a few independent teams of workers worked on this area, this area, this
area, and this area. Basically the rumor says the master first carved this
face, and then his students followed with the surrounding faces. We have to
check this kind of conjecture further, and currently one of my grad students
is working on that issue.
Another interesting point: it turned out, by examining the entire Bayon
structure, that it is rotated counterclockwise by 0.94 degrees. Why 0.94
degrees, and why slightly counterclockwise? No one knows.
But anyway, this kind of finding is obtained from this measurement. Also, the
pediments. I didn't realize the pediments were important, but the day before
yesterday I gave a (inaudible), and a professor from the Department of
Buddhism was quite excited. By the way, what is a pediment? Well, this Bayon
Temple was rebuilt a couple of times (inaudible) -- I don't know what
'pediment' is in English; window carving or something. It's like a Christmas
tree: a Christmas tree is the tree you decorate, right?
>>: Ornament.
>> Katsushi Ikeuchi: Yes. So a decoration above a window, that is called a
pediment, I believe. However, these pediments are well hidden. Even if you
visit the Bayon Temple today you cannot see them like this; the gap is around
40 centimeters and quite dark. So what we did was put the sensor in here
(inaudible), and through this we measured various portions of the pediment and
put them together.
And this is a world premiere -- well, I say 'world premiere' too much, you
know; this is maybe my 25th world premiere. So I show this one. I didn't
realize it was important, but according to the professor this is quite
significant. Why? Because originally there was a Buddha, and that Buddha was
carved out and redone into some other structure (inaudible), so that is
important. But I didn't realize it.
There are other drawings like this. You can see this picture here, but before
us no picture of it existed, because even if you go to Bayon you cannot take
pictures of these at all. So this is really a world premiere -- 20 pictures of
a world premiere.
According to the professor this is a quite important figure, though I'm not
sure why; according to him, this area here is the important part. But for me
it's just, you know, junk.
I like this one. It seems quite nice to me. But according to him, this one is
not interesting. (Laughter).
Well, the second issue is texturing, and there are lots of sub-issues. First
of all, how to determine the registration between the camera and the range
sensor; but I'll skip this component today. The second issue in texturing is
illumination variation.
Of course there are many techniques in the graphics community to smooth out
this kind of gap. You know, the Bayon Temple is a very large structure, and by
the time you have moved around it taking pictures, it has become afternoon.
So the colors are different. And of course there are many techniques to
average out and remove the gaps. But from a color point of view, we shouldn't
paint arbitrary colors onto this kind of historic site. Well, for the Bayon it
is probably tolerable, but if we are talking about a historic monument, the
original color is important. So the key question is how to obtain the real
colors -- how to remove the effect of the sunlight.
We attack this problem as follows. As you know, the observed spectrum is the
multiplication of the surface spectrum and the illumination spectrum at each
of R, G, and B. In order to simplify the story, we assume color constancy, by
the way. So basically the observed color is the multiplication of the surface
color and the illumination color. In the graphics case you generate images
from this; in the vision case we have to solve the inverse problem. I
shouldn't say too much, but you know, vision may be harder than graphics.
But anyway, we have to solve this problem. In order to simplify the story I
use a narrow-band assumption: basically, each of R, G, and B comes from just
one particular wavelength. Then each channel of the image is the surface color
times the illumination color, still in this kind of (inaudible) form. Also,
there is an ambiguity in the intensities: even though the observation is the
same, sometimes a brighter surface under darker illumination and a darker
surface under brighter illumination provide the same observation. So we cannot
obtain the absolute intensity values, only relative values.
In order to remove this ambiguity we can use so-called chromaticity: divide by
one channel, say R divided by B and G divided by B, or something like this.
There are two chromaticity conventions, but for our purposes we prefer this
one. Now, in this chromaticity space the observation equation still holds.
Basically we observe the R and G chromaticities, while SR, ER, SG, EG are
unknown; at each pixel there are four unknowns but only two equations, so it
is an ill-posed problem. And even if you add light sources, you increase the
unknowns further and you cannot solve it. So we have to introduce some
assumption.
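Written out (my reconstruction of the notation): under the narrow-band
assumption each channel factors into surface times illumination, and the
chromaticity removes the overall intensity:

$$ I_c = S_c E_c \quad (c = R, G, B), \qquad \tilde I_R = \frac{I_R}{I_B} = \frac{S_R E_R}{S_B E_B}, \quad \tilde I_G = \frac{I_G}{I_B} = \frac{S_G E_G}{S_B E_B}. $$

Per pixel, the two chromaticities are observed while the surface and
illumination chromaticities are unknown: four unknowns against two equations,
hence the ill-posedness he mentions.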
We introduce the black-body radiation assumption, namely that all the
illumination comes from black-body radiation. Basically, if you heat a black
body, different colors appear depending on the temperature, and these colors
are called black-body radiation colors. For example, blue corresponds to a
high temperature: in a gas burner, the surrounding area has lots of oxygen, so
you see this kind of high-temperature color, while the central area has less
oxygen and thus a lower temperature. The noon sun is around here, and the red
sun is around here, like this. So natural illumination can be approximated by
this black-body assumption.
It turns out (inaudible) of this black-body assumption that the black-body
illumination color is approximately a straight line: plotting 1 divided by EG
against 1 divided by ER gives this straight line. We measured it in Tokyo -- I
think it's the same in Seattle, but since we measured in Tokyo this is Tokyo
data -- and the line parameters m and c are known.
Now, using this assumption, what we can do is rewrite the observation
equation. IG and IR are observations, and these are unknowns. By the way, we
can modify the equation like this: this is an observation, this is unknown;
this is an observation, this is unknown. From the black-body assumption, EG
satisfies this kind of equation with the known parameters m and c. By plugging
this component in here and this component in there, we can set up this kind of
equation, where the surface color SG is the unknown value while IG and IR are
observed values and these are known parameters. So at each point, when you
observe, you can write down one straight line of possible surface colors.
Fortunately, this straight line differs depending on the observation. So one
observation provides one straight line, another observation provides another
straight line, and their intersection provides the true color. Now, this is a
bit unstable, but here is an example: given a 4 PM image and a 1 PM image,
from these we can obtain the color.
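A minimal sketch of that two-observation solution. Each image taken under a
different black-body illumination yields one line of possible surface
chromaticities; the line coefficients would come from the observed
chromaticities and the calibrated black-body parameters m and c (the exact
coefficient formulas from his slides are not reproduced here, so this is
schematic):

```python
import numpy as np

def surface_color_from_two_observations(line1, line2, valid_box=None):
    """Each observation constrains the surface chromaticity
    s = (sR, sG) to a line a*sR + b*sG = d, given as (a, b, d).
    Two observations under different illumination -> intersect
    the two lines; the intersection is the true surface color."""
    A = np.array([line1[:2], line2[:2]], dtype=float)
    d = np.array([line1[2], line2[2]], dtype=float)
    s = np.linalg.solve(A, d)
    if valid_box is not None:
        # Stability constraint from the talk: if the solution
        # falls outside the plausible region, force it back inside.
        lo, hi = valid_box
        s = np.clip(s, lo, hi)
    return s
```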
Now, unfortunately this is still unstable, so we introduce a couple of
constraints, such as requiring the solution to lie in this area; if the
solution falls outside, we force it back inside. Then this is the true color.
And this is a combination of the various -- actually, this is the light source
color and this is the resulting observation.
And this is the true color, this is the solution with the introduced
constraint, while this is the original solution. We claim that our method is
better than the previous method with respect to matching the true colors.
This is the input, this is the true color, this is our solution, and this is
the original solution.
By pasting this recovered real color onto the model we can obtain this one.
But there is still an ambiguity in intensity, so by adjusting the intensity
histograms between the brighter areas and the darker areas we can obtain this
kind of result. And using this kind of data, one of the Japanese printing
companies generated this kind of movie.
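The intensity adjustment he mentions can be sketched as classic histogram
matching between a darker region and a brighter one (a generic version, not
their exact procedure):

```python
import numpy as np

def match_histogram(src, ref):
    """Remap src intensities so their cumulative histogram matches
    ref's; used here to reconcile a darker region with a brighter
    one after the chromaticity has been fixed."""
    s_vals, s_counts = np.unique(src.ravel(), return_counts=True)
    r_vals, r_counts = np.unique(ref.ravel(), return_counts=True)
    s_cdf = np.cumsum(s_counts) / src.size
    r_cdf = np.cumsum(r_counts) / ref.size
    mapped = np.interp(s_cdf, r_cdf, r_vals)   # CDF-to-CDF lookup
    return np.interp(src.ravel(), s_vals, mapped).reshape(src.shape)
```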
A little bit on the content issue: why are we working in 3D? Because by using
3D we can generate various content. For example, in the case of another big
Buddha, the (inaudible) big Buddha was originally built in the 8th Century but
burned down twice in domestic civil wars.
We can bring it back to its original shape: fortunately we have lots of
literature and also miniature models, so by combining the literature
information, the miniature information, and the current 3D model we can
reconstruct the original state.
>>: The great Buddha at Todaiji Temple is the most important and greatest
Buddha in Japan. The 15-meter bronze Buddha image housed in the Great Buddha
Hall was originally constructed in the 8th Century. Unfortunately, however,
civil disasters critically damaged these architectural monuments. The
currently existing Buddha and great hall were built during the 17th Century.
We have been attempting to restore this great cultural heritage object to its
original state through the use of digital techniques. First, the current view
was captured with three-dimensional laser scanning. Using the digital model
and a literature survey enabled us to synthesize the original model.
In May 2001, we spent two weeks digitizing this great Buddha. In order to
obtain higher viewpoints, scanning was done not only from the ground but also
from the ceiling. First, 180 range images were converted into triangular mesh
models; then all partial mesh models were aligned together, avoiding error
accumulation by using a simultaneous registration algorithm. After aligning
all mesh models, our octree-based parallel volumetric merging algorithm was
applied to determine the consensus surface and generate a unified mesh model
of the great Buddha. This information is vital for cultural conservation and
will be useful for further simulation of the original-state restoration.
In the history of Japanese Buddhism, temple architectural styles have
gradually changed. The current Buddha hall, built in the 17th Century, is
distinctly different from the original style. Fortunately, about 100 years ago
architectural historians built a miniature of the original state according to
a literature survey and a comparison with architecture of the same period. We
have also scanned this model using laser sensors.
Due to the limits of the sensor's accuracy and the measurement configuration,
however, the fine portions of the miniature could not be acquired at a
satisfactory level. Toshodaiji (phonetic) Temple is a construction of the 8th
Century and retains the ancient styles of Buddhist architecture. In June 2001,
we went to Toshodaiji (phonetic) and acquired partial 3D models; in total, 780
range images were taken of 20 parts of the architecture. From the miniature
model, the global positions and rough dimensions of the parts can be acquired.
We extended our alignment algorithm to be able to determine the optimal
scaling parameters. The miniature model was combined with the parts models
into the final one through this extended algorithm.
The last step was the restoration of the great Buddha. The literature survey
enabled us to determine the dimensions of the original great Buddha. The
current model was morphed into the original model using these dimensions.
Through digital compositing and morphing technology using the archaeological
knowledge, we can synthesize the original view of the great Buddha in its
(inaudible).
>> Katsushi Ikeuchi: So today's talk was based on the Bayon project and its
sensing issues: how we designed the balloon sensor to obtain various viewing
directions, the alignment algorithms for the huge data, and the
illumination-corrected color estimation. (Inaudible) I could talk about the
remaining issues, the 2D scenario and also the faces, but anyway, not here.
The sponsorship is roughly 10 million U.S. dollars over 10 years. And by the
way, from this project I produced 10 Ph.D.s, and I collected all their work in
one book, so I'm giving this to Sing Bing Kang.
>> Sing Bing Kang: Oh, thank you.
>> Katsushi Ikeuchi: It is about archiving cultural objects, so if you are
interested, please read it. I'm advertising. And there is also a Website.
Thank you very much.
(Applause).
>> Sing Bing Kang: Are there any additional questions for our speaker?
>> Katsushi Ikeuchi: Yes?
>>: Who gave you 10 million dollars? (Laughter).
>> Katsushi Ikeuchi: The Ministry of Education, actually.
>>: Sorry?
>> Katsushi Ikeuchi: The Ministry of Education. Because culture is important.
But seriously, this project is not just cultural preservation; it develops
sensing technologies as well as software technologies. Due to that, the
balloon sensor (inaudible).
>>: So Katsushi, is the data going to be available to the public?
>> Katsushi Ikeuchi: Yeah, that's a touchy issue. The reason is that in order
to obtain this data we negotiated with the Cambodian government. I have tried
to push them to make it publicly available, but the negotiations are still
going on, and some portions are still missing, so this August we are returning
to Cambodia again. Once it is completed, yes, I'll try to make the data
available.
>>: What about the previous ones, like the Nara Buddha --
>> Katsushi Ikeuchi: The Nara Buddha is more difficult. The reason is that it
is an active religious site, and the temple is quite reluctant to make the
data open. I shouldn't say this, but one of the reasons I went to Cambodia is
that the Bayon is a temple, but not an active religious site, so it is
relatively easy to negotiate.
>>: (Inaudible) range scanning versus very dense scanning and stereo. Some of
the results I've seen from stereo lately seem to indicate that you could
almost do everything with photos (inaudible).
>> Katsushi Ikeuchi: Actually, if the 3D object has lots of geometric
features, a stereo system works well. But if the object has fewer geometric
features, such as a curved, smooth surface, then a range sensor is better. So
recently I have been thinking of combining a range sensor with a binocular
stereo system. But of course binocular stereo still has a long way to go.
Actually -- I didn't talk about this -- with a binocular stereo system,
texture mapping is easy. So we use it for texturing: first we obtain the
three-dimensional shape as well as the color, then align that 3D data on top
of the range data obtained by the range sensor; then we can texture the range
data with the images.
>>: At the Bayon Temple there are still lots and lots of holes; everywhere you
look you keep finding little holes.
>> Katsushi Ikeuchi: Gradually the number of holes is decreasing, but you
know, still --
>>: Are you going about that in some systematic way where you say okay,
here's a hole (inaudible).
>> Katsushi Ikeuchi: Yeah, actually this is the purpose of obtaining the 3D
data. For graphics purposes we can use a hole-filling technique and remove the
holes, but that is artificial, and some of the architects don't like it; they
want the raw data. So even though there may be a hole, that's fine; precise,
accurate data is better than --
>>: (Inaudible) merge between the scan data and stereo: you could send
somebody in with a model showing where the holes are, to carefully photograph
them from many directions rather than (inaudible).
>> Katsushi Ikeuchi: Actually, a stereo system works best (inaudible), so if
we combine the stereo system with the range data we can remove the holes.
>>: When you showed close-ups of the Bayon Temple walls, there were some
sensor lines (inaudible) surfaces. It wasn't as smooth. So (inaudible) process
the data (inaudible) of the Bayon Temple?
>> Katsushi Ikeuchi: Which one? You know, (inaudible) sensor noise.
>>: (Inaudible) they look kind of jagged --
>> Katsushi Ikeuchi: Yeah, I know. Somehow our merging program doesn't
produce, how do you say, sharp edges, due to the limitation of (inaudible).
And in this case the resolution was not two-centimeter resolution; rather,
this video is at 10-centimeter resolution.
>>: But you use (inaudible).
>> Katsushi Ikeuchi: Maybe we can do that. But for that we need -- yes, I'm
talking about the (inaudible) e-Heritage Project we are building, and we
definitely need this kind of help from Microsoft, such as various, you know,
geometric guides, and if possible I'll send my grad student here to work on
and solve this problem.
>> Sing Bing Kang: No more questions? Well, let's thank (inaudible).
(Applause)