V4 - Comments and Improvements

Dear Editor,
We have revised our paper based on the comments and suggestions made by the reviewers. In
particular, we have improved the manuscript throughout regarding language and presentation
style, and we have also incorporated additional references as suggested by the reviewers.
Further, we have modified and revised the text of our manuscript considering all the specific
comments that were made.
The clarity of our presentation should now be much better, thanks to the extremely helpful and
valuable comments. Below, we provide our detailed responses to all recommendations and
requests made by the reviewers.
Yours truly,
Yuanfeng Zhu
Response to Reviewer 1:
Comments to the Author
Nice work. There are some nice ideas presented here. Ultimately, I believe the manuscript requires a fair number of improvements in order to be ready for publication:
- The grammar is very difficult to read, a good thorough edit is a must.
The paper has been extensively rewritten to improve grammar and clarity.
- The section numbers seem to have fallen off.
Section numbers have now been added.
- Motion capture is introduced as a difficult problem on page 3, in the introduction, but we don't
learn what it is used for until page 19.
The challenge of using motion capture is now addressed in the second paragraph of the introduction.
- Are there other motion capture techniques to consider that are non-intrusive, including
image-based ones (such as painting lines directly on the user's fingers and using cameras)?
Alternatively, would Kinect-type technology work here?
Motion capture may be feasible for straightforward playing, but it does not cope well with
situations such as the finger crossover, where parts of the hand are occluded. One of our goals is
to be able to play any arbitrary piece of input music, which would require significant adaptation
of any captured movement to fit the new music. In this work, we instead explore a generative
approach, using a limited amount of motion capture data to improve realism.
- The related work section is severely lacking. There are many other references that are relevant
to this work. To pick one example, the use of a Trellis graph for optimizing a motion path
problem is not new, though no references to it were made.
Six related papers have been added to the related work section. Five of these, which use a Trellis
graph for fingering generation, are now cited at the end of the first paragraph of the related
work section.
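To illustrate the idea, the sketch below shows a generic trellis (Viterbi-style dynamic programming) formulation of fingering selection; the transition cost and note representation are simplified placeholders, not the cost model used in our paper or in the cited work.

```python
# Generic sketch of a trellis (Viterbi-style) fingering optimizer.
# The transition cost below is a simplified placeholder, not the cost
# model used in our paper or in the cited fingering papers.

def fingering_trellis(keys, fingers=(1, 2, 3, 4, 5)):
    """keys: list of piano key indices; returns one low-cost finger per note."""

    def transition_cost(prev_key, prev_finger, key, finger):
        # Placeholder: penalize reusing a finger on a different key and
        # penalize spans that do not match the finger spacing.
        if finger == prev_finger and key != prev_key:
            return 10.0
        return abs((key - prev_key) - (finger - prev_finger))

    # cost[f] = best accumulated cost of playing the current note with finger f
    cost = {f: 0.0 for f in fingers}
    backpointers = []

    for i in range(1, len(keys)):
        new_cost, pointers = {}, {}
        for f in fingers:
            candidates = {pf: cost[pf] + transition_cost(keys[i - 1], pf, keys[i], f)
                          for pf in fingers}
            best_prev = min(candidates, key=candidates.get)
            new_cost[f], pointers[f] = candidates[best_prev], best_prev
        cost = new_cost
        backpointers.append(pointers)

    # Backtrack from the cheapest final finger to recover the full fingering.
    path = [min(cost, key=cost.get)]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    return list(reversed(path))

print(fingering_trellis([60, 62, 64, 65, 67]))  # five ascending white keys
```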
- The wrist rotations initial values are computed using pre-computed weights (Page 15, line 10),
how are these weights pre-computed?
The computation of the weights is discussed in the 2nd paragraph in Section 4.1.3.
- The hand model is not sufficiently described, it would be beneficial to point out the 21 joints
and using clear notation for how the rotation values for the 21 joints are articulated (or which
specifically are articulated by the system and which are not).
A detailed description of the hand model used in our system has been added to the second
paragraph of Section 4, Finger and Hand Pose Calculation for Chords.
- The hand in the video looks very uncomfortable and not human like. It looks like some
proportion compromises had to be made in order to facilitate a solution that will hit the target.
We have tried to improve the realism of our system. Our hand model is based on the real hand
of a piano player, so that we can evaluate and adjust the hand parameters more effectively and
efficiently. The same hand model is used to generate all of the demos, which together cover a
typical piano performance, including the playing of scales, chords and a complete music piece,
and we have also generated video comparisons between real playing and the generated
animation in the demos called C Major skill and Chord.
The viewpoints, the mesh skinned to the hand skeleton, or some unnatural hand poses may
affect the appearance of the hand during playing. In addition, one important future
improvement is to generate standard piano playing for hand models of various sizes.
- It's very nice that you tackled the crossover of the thumb problem.
Thank you for your appreciation of this point.
- It sounds like the motion capture data is only used to improve the quality of the animation in
the wrist's translational motion. If so, why not use the motion capture data for more realistic
rotational components as well (in both the wrist and the finger)... this has been done before as
well.
The motion capture data is in fact used extensively in our system to extract parameters that
drive both the translational and rotational components of the hand motion, but the data are
usually not used directly (some details of how the motion data are used are omitted from the
paper because they do not belong to its contribution). A few examples for better explanation:
The motion capture data is used directly to improve the wrist's translational motion, as
discussed in Section 6.1, Wrist motion between chords.
The motion capture data is also used indirectly to evaluate the weights for generating hand
poses. A Vicon camera array is used to capture the wrist rotation for the key poses of different
finger crossovers, chords and arpeggios: markers are attached to the wrist and to the base of the
middle finger, additional markers are attached to the piano keyboard surface to set up a local
coordinate system, and the maximum wrist rotation angle is then evaluated in the keyboard's
local coordinate system for the various key poses.
The Vicon camera array is likewise used to capture the rotation from each finger base to its
fingertip in a similar way, in order to evaluate how finger rotation about the vertical axis
influences the wrist rotation. This is discussed in more detail in Section 4.1.3, Initiate wrist
orientation.
Therefore, the captured data is also used to improve the realism of the rotational components,
but it is not used directly to drive the animation. We avoid driving the animation directly with
captured data because we want to generate hand motion for various pieces of music at various
speeds (which this paper generally achieves) and for hand models of various sizes (an important
item of future work), and this would be impossible if the animation were driven only by
captured data.
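To make the marker setup more concrete, the sketch below shows one way a keyboard-local coordinate frame can be built from surface markers and used to measure a wrist rotation angle; the marker names, frame construction and angle definition are assumptions for illustration, not the exact procedure described in the paper.

```python
import numpy as np

# Illustrative sketch: build a keyboard-local frame from three keyboard-surface
# markers and measure the wrist rotation about the local vertical axis from the
# wrist and middle-finger-base markers. Marker names and the angle definition
# are assumptions for illustration only.

def keyboard_frame(origin, along_keys, on_surface):
    """Orthonormal frame: x along the keys, z up from the surface, y = z x x."""
    x = along_keys - origin
    x = x / np.linalg.norm(x)
    z = np.cross(x, on_surface - origin)
    z = z / np.linalg.norm(z)
    y = np.cross(z, x)
    return np.stack([x, y, z], axis=1)  # columns are the local axes

def wrist_yaw_deg(wrist, mid_finger_base, frame):
    """Angle of the wrist-to-finger-base direction about the local vertical axis."""
    v_local = frame.T @ (mid_finger_base - wrist)
    return np.degrees(np.arctan2(v_local[1], v_local[0]))

# Example with made-up marker positions (metres).
frame = keyboard_frame(np.array([0.0, 0.0, 0.0]),
                       np.array([1.0, 0.0, 0.0]),
                       np.array([0.0, 0.2, 0.0]))
print(wrist_yaw_deg(np.array([0.10, -0.15, 0.05]),
                    np.array([0.12, -0.05, 0.06]), frame))
```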
- In the video the wrist translation looks linear and unnatural when large gaps are crossed.
This problem has been addressed in the new demo by applying spline interpolation to the
wrist-translation components of the key hand poses.
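As an illustration of the idea (the specific spline and keyframe layout here are assumptions, not the exact scheme used in the paper), a Catmull-Rom interpolation of the wrist key positions yields a smooth, non-linear translation path between distant chords:

```python
import numpy as np

# Illustrative sketch: smooth wrist translation between key hand poses with a
# Catmull-Rom spline instead of straight linear interpolation. The keyframe
# layout is an assumption for illustration, not the paper's exact scheme.

def catmull_rom(p0, p1, p2, p3, t):
    """Position at parameter t in [0, 1] on the segment from p1 to p2."""
    t2, t3 = t * t, t * t * t
    return 0.5 * ((2 * p1) +
                  (-p0 + p2) * t +
                  (2 * p0 - 5 * p1 + 4 * p2 - p3) * t2 +
                  (-p0 + 3 * p1 - 3 * p2 + p3) * t3)

def interpolate_wrist(key_positions, samples_per_segment=20):
    """key_positions: (N, 3) wrist translations at key hand poses."""
    pts = np.asarray(key_positions, dtype=float)
    # Pad the ends so the first and last segments have neighbours.
    padded = np.vstack([pts[0], pts, pts[-1]])
    path = []
    for i in range(1, len(padded) - 2):
        for s in range(samples_per_segment):
            t = s / samples_per_segment
            path.append(catmull_rom(padded[i - 1], padded[i],
                                    padded[i + 1], padded[i + 2], t))
    path.append(pts[-1])
    return np.array(path)

# Example: three key wrist positions spanning a large jump along the keyboard.
print(interpolate_wrist([[0.0, 0.1, 0.0], [0.35, 0.12, 0.02], [0.7, 0.1, 0.0]]).shape)
```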
- Very good that you tackled the non-instructed finger interdependence problem. However, it
sounds like you only consider the Y translation component. The fingers are interdependent in
rotation as well, among fingers and within the joints of a finger itself.
Yes. It would be meaningful to improve the quality of the finger-motion animation by evaluating
how the music's volume and speed influence the interdependent rotation between fingers and
within the joints of an instructed or non-instructed finger. We consider this future work and now
discuss it in the last paragraph of the conclusion.
- Though its aim is to be an instructional tool, no mention was made as to its effectiveness in
such a role (e.g., as determined through experiment, and analysis by piano teachers). It would
be great to show it to a piano teacher and ask how well this virtual student did... those results
would be interesting.
While one of the motivations for this work is to employ it in piano tutoring, the main focus of
the current paper is on animation generation, and we have tried to position it accordingly.
Evaluating the system as a tutoring tool is future work.
We do intend to show the work to a piano teacher and agree that this would provide interesting
feedback.
Overall, a good start at an interesting project. Though, I think, much is needed in terms of polish
and execution.
Thank you very much for the many valuable comments, which have helped greatly in improving our work.
Response to Reviewer 2:
Comments to the Author
The paper is interesting and technically sound. There are many problems with the English, both
spelling and grammar, and these must be corrected in order to make the paper acceptable for
journal publication.
Thanks a lot for your appreciation of our work!
The paper has been extensively rewritten to improve grammar and clarity.
The paragraph on page 12, lines 16-36, has inconsistent directions. It states that -Z is "to the
left" but then says "-0.5 (move it to the right)".
This was an error, and the words "-0.5 (move it to the right)" have been replaced with "0.5
(move it to the right)".
Various costs are defined. For example, Cost(a,b,d) on page 9 and graphed in Figure 3. How was
the data obtained for this graph? It is uneven, which suggests some kind of experimental
measurement rather than a formula. Similarly, it would be useful to know how other costs were
estimated.
An explanation of how the costs are obtained has been added to Sections 3.1.2 and 3.1.3. To
make the various costs easier to understand, we have also rewritten both sections.
The discussion focuses mainly on melodies and chords. Does the program work in other
situations? For example, a pianist often has to play both melody and (partial) accompaniment
with the right hand.
This is a good suggestion: playing melody and partial accompaniment with one hand is a
commonly used advanced technique, but we have not yet considered it. We have added this
limitation as future work, as the fifth point in the conclusion and future work section.
Response to Reviewer 3:
Comments to the Author
- a bit more elaboration on previous / related work would help in better understanding the
explanations that follow in the paper
To enrich the research background, six additional papers are now cited and discussed in the
paper.
- including some figures / images would be helpful to visualize what is happening
All of the figures have been regenerated, and three new figures (Figures 6, 10 and 11) have been
added to help explain the important processes.
- a few instances of minor grammatical errors (i.e: 'relax' instead of 'relaxed')
These errors have now been corrected, and the paper has been rewritten for better clarity.
- Overall: very interesting application of ideas, intriguing blend of computational techniques,
music and virtual animation
Thank you very much for your feedback! We hope that in the near future this research can help
students teach themselves to play the piano.