[SAK-15780] SectionManager caching breaks multiple section member updates in a single thread
Created: 25-Feb-2009 Updated: 24-Nov-2009 Resolved: 26-Feb-2009
Status: Closed
Project: Sakai
Component/s: Section Info
Affects Version/s: 2.5.2, 2.5.3, 2.5.4, 2.6.0
Fix Version/s: 2.5.5, 2.6.0, 2.7.0
Type: Bug
Priority: Critical
Reporter: Ray Davis
Assignee: Oliver Heyer
Resolution: Fixed
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Issue Links:
Relate
relates to SAK-13400 Add thread-local caching to SectionMa... Closed
relates to SAK-16602 Section Memberships screen not updati... Closed
Previous Issue Keys:
Description
While loading test data, I found that when I tried adding members to more than one section in a
single thread, changes would only show up for one section. For example, if I tried adding 70
students to "Disc 1" and 70 students to "Disc 2", at the end of the run, "Disc 1" would have 70
members but "Disc 2" would have 0 (or vice versa). Tomcat logs, the Sakai events table, and
DB queries while the thread was paused all seemed to indicate that the updates were actually
progressing, but then were somehow rolled back.
I've tracked the problem down to two ThreadLocal caches used by:
sections/sectionsimpl/sakai/impl/src/java/org/sakaiproject/component/section/sakai/SectionManagerImpl.java
Details in comments.
Comments
Comment by Ray Davis [ 25-Feb-2009 ]
The findGroup method gets and caches a site Group object from SiteService findGroup.
That object includes a reference to a Site object.
The getSite method gets and caches a different copy of the Site object, which may in
turn end up referencing copies of the Group objects that differ from the ones held in the
SectionManagerImpl cache.
First call to addStudentToSection:
1. Gets and caches a site Group object for the section (with its own copy of the Site object).
2. Calls dropEnrollmentFromCategory, which...
2.a. Gets and caches a new copy of the Site object, which ends up with its own cached copies of
the groups.
2.b. Always updates every group's membership list with what it thinks the current state is,
whether the student being dropped is already a member or not.
3. Calls group.addMember, which won't make any changes to the cached site's copy of the
groups.
4. Calls SiteService saveGroupMemberships to save the state from the cached group's point of
view.
Second call to addStudentToSection, specifying a different student and a different section:
1. Gets and caches a site Group object for the new section (with its own copy of the Site object).
2. Calls dropEnrollmentFromCategory, which...
2.a. Fetches the cached copy of the Site object, containing the obsolete copies of the groups.
2.b. Updates the first section's group's membership list with the incorrect (empty) cached state,
undoing the first call to addStudentToSection.
3. Goes on to deal with the second section, leaving the first section rolled back.
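The sequence above can be reproduced in a small stand-alone model. This is only a sketch: GroupCopy, SiteCopy, the in-memory "db" map and the student names are hypothetical stand-ins, not Sakai's Site/Group API or the real SectionManagerImpl code. It assumes, as described above, that each cache holds its own detached copy of the membership data and that dropEnrollmentFromCategory writes every group back.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical model of the two-cache problem; none of these types are Sakai classes.
class StaleCacheDemo {
    /** Stands in for the database: section id -> member ids. */
    static final Map<String, Set<String>> db = new HashMap<>();

    /** A detached copy of one section's membership, loaded from the store. */
    static final class GroupCopy {
        final String id;
        final Set<String> members;
        GroupCopy(String id) { this.id = id; this.members = new HashSet<>(db.get(id)); }
        void save() { db.put(id, new HashSet<>(members)); }
    }

    /** A detached copy of the site, holding its own copies of every group. */
    static final class SiteCopy {
        final Map<String, GroupCopy> groups = new HashMap<>();
        SiteCopy() { for (String id : db.keySet()) groups.put(id, new GroupCopy(id)); }
    }

    // The two independent per-thread caches.
    static final ThreadLocal<Map<String, GroupCopy>> groupCache =
            ThreadLocal.withInitial(HashMap::new);
    static final ThreadLocal<SiteCopy> siteCache = new ThreadLocal<>();

    static GroupCopy findGroup(String id) {
        return groupCache.get().computeIfAbsent(id, GroupCopy::new);
    }

    static SiteCopy getSite() {
        if (siteCache.get() == null) siteCache.set(new SiteCopy());
        return siteCache.get();
    }

    static void dropEnrollmentFromCategory(String student) {
        for (GroupCopy g : getSite().groups.values()) {
            g.members.remove(student);
            g.save(); // writes every group back, whether it changed or not
        }
    }

    static void addStudentToSection(String student, String sectionId) {
        GroupCopy group = findGroup(sectionId); // 1. cached group copy
        dropEnrollmentFromCategory(student);    // 2. uses the cached site's own copies
        group.members.add(student);             // 3. invisible to the cached site
        group.save();                           // 4. saved from the group's point of view
    }

    /** Adds one student to each of two sections and returns the final store. */
    static Map<String, Set<String>> run() {
        db.clear();
        groupCache.remove();
        siteCache.remove();
        db.put("Disc 1", new HashSet<>());
        db.put("Disc 2", new HashSet<>());
        addStudentToSection("alice", "Disc 1");
        addStudentToSection("bob", "Disc 2");
        return db;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

In this model the second call overwrites "Disc 1" with the cached site's stale (empty) copy, so "Disc 1" ends up empty and only "Disc 2" keeps its new member.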
Comment by Ray Davis [ 25-Feb-2009 ]
One easy (if not necessarily optimal) fix is to clear the cached site object when
saveGroupMemberships is called.
It might also be a good idea to cut down on unnecessary noise by changing
dropEnrollmentFromCategory to ignore groups which don't contain the specified enrollment.
We'd need to make sure the apparent redundancy wasn't put in for some undocumented reason,
though.
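Both suggestions can be sketched against a simplified model. Everything here is an assumption made for illustration: the "store" map, the "writes" counter and the method bodies are hypothetical, and only the method names are borrowed from the discussion above. This is not the change that was eventually committed.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical model, not Sakai's API: a store of section id -> member ids.
class ClearOnSaveSketch {
    static final Map<String, Set<String>> store = new HashMap<>();
    // Per-thread snapshot of all sections (stands in for the cached Site object).
    static final ThreadLocal<Map<String, Set<String>>> cachedSite = new ThreadLocal<>();
    static int writes = 0; // counts store writes, to make the reduced noise visible

    static Map<String, Set<String>> getSite() {
        Map<String, Set<String>> site = cachedSite.get();
        if (site == null) {
            site = new HashMap<>();
            for (Map.Entry<String, Set<String>> e : store.entrySet()) {
                site.put(e.getKey(), new HashSet<>(e.getValue()));
            }
            cachedSite.set(site);
        }
        return site;
    }

    /** Suggestion 1: every save invalidates the cached site snapshot. */
    static void saveGroupMemberships(String sectionId, Set<String> members) {
        store.put(sectionId, new HashSet<>(members));
        writes++;
        cachedSite.remove();
    }

    /** Suggestion 2: ignore sections that do not contain the student. */
    static void dropEnrollmentFromCategory(String student) {
        // The local reference stays valid even after a save clears the cache.
        Map<String, Set<String>> site = getSite();
        for (Map.Entry<String, Set<String>> e : site.entrySet()) {
            if (e.getValue().remove(student)) {
                saveGroupMemberships(e.getKey(), e.getValue());
            }
        }
    }

    static void addStudentToSection(String student, String sectionId) {
        dropEnrollmentFromCategory(student);
        Set<String> members = new HashSet<>(store.get(sectionId));
        members.add(student);
        saveGroupMemberships(sectionId, members);
    }
}
```

With the snapshot cleared on each save, the next call rebuilds it from current data, and a student who is not enrolled anywhere causes no extra writes.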
Comment by Ray Davis [ 25-Feb-2009 ]
This was triggered by the caches implemented for the performance issue. The basic design flaw
was that the two caches end up referring to different copies of the same objects. We need to
ensure that the same objects are reached no matter how we reference them. This may make the
code more complicated, but it should fix the bug.
Josh doesn't remember any reason for dropEnrollmentFromCategory to be so exuberant with its
updates, and there are no comments to justify it, so I'll change that too.
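One way to guarantee that the same objects are reached by every path is to keep a single per-thread cache and resolve groups through it. The sketch below assumes that approach; Group, Site and loadSite are hypothetical stand-ins rather than Sakai classes, and the committed fix may well be structured differently.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical model, not Sakai's API.
class SharedObjectCacheSketch {
    static final class Group {
        final String id;
        final Set<String> members = new HashSet<>();
        Group(String id) { this.id = id; }
    }

    static final class Site {
        final Map<String, Group> groups = new LinkedHashMap<>();
    }

    // The only cache: one Site per thread, which owns its Group objects.
    static final ThreadLocal<Site> siteCache = new ThreadLocal<>();

    /** Stands in for fetching a fresh site (with two sections) from the site service. */
    static Site loadSite() {
        Site s = new Site();
        s.groups.put("Disc 1", new Group("Disc 1"));
        s.groups.put("Disc 2", new Group("Disc 2"));
        return s;
    }

    static Site getSite() {
        Site s = siteCache.get();
        if (s == null) {
            s = loadSite();
            siteCache.set(s);
        }
        return s;
    }

    /** Groups are always resolved through the cached site, never cached separately. */
    static Group findGroup(String id) {
        return getSite().groups.get(id);
    }

    static void dropEnrollmentFromCategory(String student) {
        for (Group g : getSite().groups.values()) {
            g.members.remove(student);
        }
    }

    static void addStudentToSection(String student, String sectionId) {
        Group g = findGroup(sectionId);
        dropEnrollmentFromCategory(student);
        g.members.add(student); // visible through getSite() too: same object
    }
}
```

Because findGroup and getSite now hand out the same Group instances, an update made through one path cannot be overwritten by a stale copy reached through the other.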
Comment by Ray Davis [ 25-Feb-2009 ]
I just committed a fix (I hope) to our local 2.5.x-based branch. I'm no longer seeing the
functional bug. The dropEnrollmentFromCategory change sped up my test case (distributing
140 students across 3 sections) from 2 min 18 sec to 42 sec.
I'll look into merging to trunk tomorrow.
Comment by Stephen Marquard [ 25-Feb-2009 ]
Ray, is there any UI action in Section Info that would trigger this bug, or does it only show up
when calling SectionManager from other code?
Comment by Ray Davis [ 26-Feb-2009 ]
So far I haven't found any code paths in the Section Info tool that would trigger the functional
bug, which explains why no one's complained about it so far.
The performance bug may be another matter.
Comment by Stephen Marquard [ 26-Feb-2009 ]
The most critical performance issue for us has been the student options (signup, switch) which
doesn't have dropEnrollmentFromCategory in the code path.
We haven't noticed other specific performance issues on the instructor/TA options (not to say
this wasn't an issue), but they're used much less frequently, so have smaller total impact. Good
to have it fixed though.
Comment by Ray Davis [ 26-Feb-2009 ]
A straightforward
    sections> svn merge -c 58022 https://source.sakaiproject.org/svn/msub/berkeley.edu/bspace/sections/sakai_2-5-x .
seemed to work fine in my brief testing, so I've checked this into trunk (rev. 58044) and
recommend it also be considered for 2.5.x and 2.6.x.
Comment by Stephen Marquard [ 20-Mar-2009 ]
Notes for QA: this isn't exposed in the UI, so QA testing should focus on regression testing for
existing functionality.
Comment by Peter Peterson [ 16-May-2009 ]
QA SUMMARY - regression tested existing functionality per Stephen Marquard's Jira notes for QA
functionality seems to perform as expected
QA PASS
Comment by Jean-François Lévêque [ 18-May-2009 ]
Has regression testing been done on 2.7? AFAICT, this isn't in 2.6 yet.
Comment by Anthony Whyte [ 18-May-2009 ]
merged 2.6.0, r62590.
Comment by Jean-François Lévêque [ 19-May-2009 ]
2.5.x merge r62592
Comment by Anthony Whyte [ 22-Jun-2009 ]
Merged to 2.5.5 branch for sakai-2.5.5-rc01.
Comment by Peter Peterson [ 28-Sep-2009 ]
In 2.6.0; removed 2.6.x as fix version in order to assure a clean 2.6.1 filter.
Generated at Sun Mar 06 04:15:10 CST 2016 using JIRA 6.4.11#64026sha1:78f6ec473a3f058bd5d6c30e9319c7ab376bdb9c.