Discussion Class 4: Latent Semantic Indexing

Discussion Classes
Format:
Question
Ask a member of the class to answer.
Provide opportunity for others to comment.
When answering:
Stand up.
Give your name. Make sure that the TA hears it.
Speak clearly so that all the class can hear.
Suggestions:
Do not be shy about presenting partial answers.
Differing viewpoints are welcome.
Question 1: Basics
(a) Explain the name "latent semantic analysis".
(b) What problems is latent semantic analysis attempting to solve?
(c) What criteria were used in selecting singular-value decomposition?
Question 2
[Figure: plot showing terms, documents, and a query; dashed lines mark cosine > 0.9]
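The cosine > 0.9 criterion in the figure legend can be illustrated with a minimal sketch. The coordinates and the names `doc_a`, `doc_b`, and `term_x` are hypothetical and are not taken from the figure; the sketch only shows how a cosine threshold selects items that point in nearly the same direction as the query.

```python
import numpy as np

def cosine(u, v):
    """Cosine of the angle between two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Hypothetical 2-D coordinates for a query, two documents, and one term.
query = np.array([1.0, 0.5])
items = {
    "doc_a": np.array([2.0, 1.1]),
    "doc_b": np.array([0.2, 1.0]),
    "term_x": np.array([0.9, 0.4]),
}

threshold = 0.9
# Keep every item whose cosine with the query exceeds the threshold.
near = [name for name, vec in items.items() if cosine(query, vec) > threshold]
print(near)
```

Because the cosine ignores vector length, `doc_a` is selected even though it lies farther from the origin than the query does.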
Question 3: Rank Reduction
(a) Explain the matrices in the singular value decomposition:
$X = T_0 S_0 D_0'$
(b) The rank reduction method is to keep the first k (largest) diagonal elements of $S_0$ and set the others to zero. This gives:
$X \approx \hat{X} = T S D'$
What has this to do with latent semantics?
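The rank reduction described in Question 3(b) can be sketched with NumPy. This is a minimal illustration, not the paper's implementation: the toy term-document matrix, the helper name `truncated_svd`, and the choice k = 2 are assumptions made for the example.

```python
import numpy as np

def truncated_svd(X, k):
    """Return T, S, D' keeping only the k largest singular values of X."""
    # Economy-size SVD: X = T0 @ diag(s0) @ D0t.
    T0, s0, D0t = np.linalg.svd(X, full_matrices=False)
    # NumPy returns singular values in descending order,
    # so the first k entries are the k largest.
    return T0[:, :k], np.diag(s0[:k]), D0t[:k, :]

# Toy term-document matrix (rows: terms, columns: documents); values are made up.
X = np.array([
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
])

k = 2
T, S, Dt = truncated_svd(X, k)
X_hat = T @ S @ Dt  # rank-k approximation of X

print(np.linalg.matrix_rank(X_hat))
print(np.round(X_hat, 2))
```

The rows of `T @ S` give term coordinates and the columns of `S @ Dt` give document coordinates in the k-dimensional space, which is where terms and documents are compared.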
Question 4: Experimental Results: 100 Factors
(a) LSI-100 does better at the right of this graph than on the left. What has this to do with synonymy and polysemy?
(b) Why were the authors surprised that TERM and SMART gave similar results?
Question 5: Experimental Results
(a) Describe the methodology of the MED experiment.
(b) What conclusions can you draw from this experiment?
(c) The results of the CISI experiment were disappointing.
What are some possible explanations?
(d) This is a new method. What comes next?
Question 6: Number of Factors
What data does this graph plot?
What conclusions can you draw from this graph?
Question 7: Performance
What does the paper say about the following?
(a) Storage requirements
(b) Efficiency of searching
(c) Updating of indexes