MATLAB vs. Alzheimer’s Support: A use of
Netscan to Compare Two Communities
Christina Pikas, College of Information Studies, University of Maryland, College Park
Abstract: I used the Microsoft Research Community Technology Group Netscan (http://netscan.research.microsoft.com)
and UsenetViews tools and Google Groups (http://groups.google.com) to examine and compare two Usenet groups:
comp.soft-sys.matlab and alt.support.alzheimers. Based on my investigations, I describe what Netscan does and does not
tell the researcher.
I. MATLAB USER GROUP
comp.soft-sys.matlab
A. Description
MATLAB is programming software used by scientists and engineers to do complex calculations
and modeling. There are many how-to books, but the software has the reputation of being very
difficult to learn and use. The developer, The MathWorks, maintains a community page
(http://www.mathworks.com/matlabcentral/) where it allows customers to share code, participate
in contests, and read the comp.soft-sys.matlab (CSSM) newsgroup. MathWorks staff, including development and
technical support engineers, regularly participate. There are about 10,000 members and about
70,000 messages per year. Posts contain code, error messages, and requests for assistance.
B. Who are the members of comp.soft-sys.matlab?
The members of CSSM are professional engineers and scientists who use MATLAB at work.
They post most heavily during the week and ask sophisticated but succinct questions. The
average length of posts for 2004 was 26 lines. In addition to MathWorks employees, many posters
use work addresses from education, government, and industry to identify themselves.
Of the top 40 authors for the first quarter of 2005, 40% only replied and did not start any threads.
I calculated the newsgroup crowd information for the first quarter of 2005 and charted it
(see Chart 1). Posts are rarely cross-posted to other groups.
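The two quantities plotted in the Newsgroup Crowd chart, days active and average posts per thread, can be derived from a plain list of posts. The following is a minimal Python sketch of that calculation, not the Netscan implementation. It assumes each post is available as a hypothetical (author, thread_id, date) record, and the sample records are invented for illustration; they are not CSSM data.

```python
from collections import defaultdict
from datetime import date

def crowd_metrics(posts):
    """Return {author: (days_active, avg_posts_per_thread)}.

    posts: iterable of (author, thread_id, post_date) tuples
    (a hypothetical record layout, assumed for this sketch).
    """
    days = defaultdict(set)      # author -> distinct dates with a post
    threads = defaultdict(set)   # author -> distinct threads posted to
    counts = defaultdict(int)    # author -> total number of posts
    for author, thread_id, post_date in posts:
        days[author].add(post_date)
        threads[author].add(thread_id)
        counts[author] += 1
    return {
        a: (len(days[a]), counts[a] / len(threads[a]))
        for a in counts
    }

# Invented sample records, for illustration only.
sample_posts = [
    ("alice", "t1", date(2005, 1, 3)),
    ("alice", "t1", date(2005, 1, 4)),
    ("alice", "t2", date(2005, 1, 4)),
    ("bob", "t3", date(2005, 2, 1)),
]

for author, (n_days, avg) in sorted(crowd_metrics(sample_posts).items()):
    print(f"{author}: {n_days} days active, {avg:.2f} posts per thread")
```

An author with few active days but many posts would sit low on the days-active axis, which is the kind of pattern the crowd view makes visible.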
C. Interesting things
From my previous research on online engineering communities [1], I found that communities that
had too high a load of students asking for basic help were less likely to be successful. This
community is active and appears attractive to students with MATLAB assignments.
Additionally, there are complaints about student users [2]. My initial thoughts were:
• Student posts are ignored
• The ratio of students to professionals is low
• My previous findings do not apply to this group; that is, student questions do not
adversely impact this online engineering community.
I looked at the number of unreplied messages (URM) as a percentage of the total number of
messages for October (prime homework season), July (summer break), and January (winter
break) 2000-2005. This does not take into account other things that might impact replies, such as
vacations, winter storms, and conferences. See Table 1 for a comparison of the total number of
posts with the number of URM. I found no significant differences in the percentage of
URM based on the month of the year. Likewise, the percentage of one-time posters is not
significantly different depending on the month (see Table 2). By scanning the most recent
month’s activity, I found several student messages that received meaningful replies. Student
posts are not ignored, but the ratio of students to professionals is low, and the basic questions do
not harm the success of CSSM.
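The month-by-month comparison can be reproduced from the counts in Table 1. The paper does not state how the months were compared, so the sketch below makes one assumption: it pools the six years for each month and reports a single unreplied rate per month. The function name is hypothetical; the numbers are copied from Table 1.

```python
# (URM, total messages) pairs for 2005 down to 2000, copied from Table 1.
table1 = {
    "Oct": [(896, 6240), (781, 6260), (447, 3763),
            (406, 3284), (345, 2600), (167, 1972)],
    "Jul": [(858, 5766), (724, 5966), (353, 3145),
            (355, 2656), (255, 1807), (159, 1635)],
    "Jan": [(666, 6659), (438, 4097), (399, 3153),
            (295, 2909), (150, 1478), (119, 1047)],
}

def pooled_urm_rate(rows):
    """Share of unreplied messages across all listed years for one month."""
    urm = sum(u for u, _ in rows)
    total = sum(t for _, t in rows)
    return urm / total

rates = {month: pooled_urm_rate(rows) for month, rows in table1.items()}
for month, rate in rates.items():
    print(f"{month}: {rate:.1%} of messages unreplied")
```

Pooling weights the busier recent years more heavily; averaging the yearly percentages instead would weight each year equally, and either choice is only a descriptive summary, not a significance test.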
Also, contrary to the opinion that USENET is dying, this group has shown continuous growth
since the first date for which archives are available in Google Groups. See the totals for each month on the
about page (http://groups.google.com/group/comp.soft-sys.matlab/about). The growth is shown
in Charts 2, 3, and 4 below.
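One rough way to quantify this growth is the year-over-year change in monthly message totals. The sketch below uses only the October totals from Table 1; the function name is hypothetical, and a single month per year is an assumption made to keep the example short, so it is a coarser view than the monthly series in the charts.

```python
# Total messages each October, copied from Table 1.
october_totals = {
    2000: 1972, 2001: 2600, 2002: 3284,
    2003: 3763, 2004: 6260, 2005: 6240,
}

def year_over_year(totals):
    """Return {year: fractional change versus the previous listed year}."""
    years = sorted(totals)
    return {
        y: (totals[y] - totals[p]) / totals[p]
        for p, y in zip(years, years[1:])
    }

changes = year_over_year(october_totals)
for year, change in changes.items():
    print(f"Oct {year}: {change:+.1%} vs. Oct {year - 1}")

# Overall ratio between the last and first October in the table.
overall = october_totals[2005] / october_totals[2000]
print(f"Oct 2005 is {overall:.1f}x Oct 2000")
```

Individual months fluctuate, so a single-month comparison like this should be read alongside the full monthly totals rather than in place of them.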
II. ALZHEIMER’S SUPPORT GROUP
alt.support.alzheimers
A. Description
From the National Institute on Aging,
“Alzheimer's disease is the most common form of dementia among older people. It
involves the parts of the brain that control thought, memory, and language.”
(http://nihseniorhealth.gov/alzheimersdisease/defined/01.html, accessed 11/16/2005)
This group is a public group to support people with the disease and caregivers. It has about 700
members and 7,000 messages a year. See Chart 5 for a graph of the decline of this group. Posts
relay concerns about sick family members, legal questions about caregivers and wills, and
requests for advice and support.
B. Who are the members of alt.support.alzheimers?
In contrast to the MATLAB group, the members of this group do not generally post from work,
but from personal addresses with AOL, Hotmail, Yahoo, etc. domains. There are many more
off-topic messages and a few spammers. The average message length for 2004 was 45 lines,
quite a bit longer than the 26 lines for CSSM. The majority of the posters identify themselves as women
caregivers to sick family members and spouses.
C. Interesting things
I selected this group to be able to use the USENETviews software provided on DVD-ROM from
Microsoft Research. I immediately located a spammer using the Newsgroup Crowd
visualization. See the Newsgroup Crowd and the AuthorLines for the spammer in Charts 6 and 7
below. It is interesting that trolls were more apparent in a health-related group, one
where the activity could be more hurtful. The surprisingly high number of URM (28% of all
messages) is explained by looking at the subjects. On the first page, the only three that were not
obviously spam were thank-you messages for assistance received.
III. FUTURE WORK AND WHAT NETSCAN DOES NOT TELL US
The MATLAB group has grown continuously while the Alzheimer’s group has not. This could
be due to competing online communities. Further work would include surveying available
Alzheimer’s communities, comparing their growth or decline over time, and tracking the migration of the
members. Interviews might explain why these people moved.
Future work might include interviewing participants to see why they join the MATLAB group.
Many do not show membership in other USENET groups. Do they trust this group more
because MathWorks employees answer questions? Why do they post instead of calling technical
support or asking colleagues? How much research do they do before posting a question?
Netscan gathers more statistics in one place than are available on Google Groups or elsewhere.
Together with UsenetViews, it provides visualization tools to help the researcher find patterns
and explain the usage of the group. It does not, however, support more qualitative
investigations. Mining the posts for concepts has to be done via Google. It would be helpful if
Netscan or UsenetViews supported searching and compiling data mined from the content. Also,
Netscan’s numbers do not match Google’s, so there are discrepancies that should be addressed.
NOTES
[1] Christina K. Pikas, “Fostering Collaboration in Engineering Communities of Practice: What Works and What
Doesn’t,” October 20, 2005. Forthcoming on the Communities of Practice Learning Center website.
[2] One thread with these complaints includes the following two posts:
Date: Sun, 4 Apr 2004 09:02:47 -0500
Subject: ATTENTION STUDENTS POSTING TO COMP.SOFT-SYS.MATLAB
Begging for help doesn't work here. Most of the respondents here are paid
professionals who are not looking to teach you in a few posts a subject that
you have failed to learn in a semester of classes. Nor are we in the business
of providing off-the-shelf solutions to your programming problems. The urgency
of your problem does not apply to us and your gratitude has little value. Your
gratitude might have some value someday if you learn to be a competent
engineer, scientist, or whatever you are doing... but that won't happen if your
strategy is to try to get someone else to do your work.
…
Date: Sun, 4 Apr 2004 12:52:46 -0500
Ditto [deleted] response, Us. Personally, I'm getting very tired of seeing
students abuse this ng. Just stopped myself from posting a nasty (-ish) reply
to another this morning. The problem is, it's just TOO DAMNED EASY for lazy
students to get answers to their questions here. No thought required.
I'm less and less inclined these days to bother replying at all.
IV. TABLES
Table 1: Number of Unreplied Messages (URM) in Octobers, Julys, and Januarys, 2000-2005

Month    URM   Total Messages   % URM
Oct-05   896   6240             14%
Oct-04   781   6260             12%
Oct-03   447   3763             12%
Oct-02   406   3284             12%
Oct-01   345   2600             13%
Oct-00   167   1972              8%

Jul-05   858   5766             15%
Jul-04   724   5966             12%
Jul-03   353   3145             11%
Jul-02   355   2656             13%
Jul-01   255   1807             14%
Jul-00   159   1635             10%

Jan-05   666   6659             10%
Jan-04   438   4097             11%
Jan-03   399   3153             13%
Jan-02   295   2909             10%
Jan-01   150   1478             10%
Jan-00   119   1047             11%

Table 2: Comparison of the percentage of one-time posters

Month    1x Posters   Total People   % 1x Posters
Oct-05   1206         2054           59%
Oct-04   1323         2152           61%
Oct-03    773         1332           58%
Oct-02    688         1172           59%
Oct-01    551          947           58%
Oct-00    349          621           56%

Jul-05   1102         1880           59%
Jul-04   1127         1926           59%
Jul-03    559         1033           54%
Jul-02    570         1000           57%
Jul-01    430          718           60%
Jul-00    311          552           56%

Jan-05   1283         2133           60%
Jan-04    840         1373           61%
Jan-03    615         1103           56%
Jan-02    514          927           55%
Jan-01    341          550           62%
Jan-00    283          431           66%

V. CHARTS
Chart 1: Manually calculated Newsgroup Crowd
Newsgroup Crowd, comp.soft-sys.matlab for 2005 Q1*
[Scatter chart. X-axis: Avg Posts per Thread (0.75 to 1.75). Y-axis: Days Active in Quarter (0 to 90).]
*I reviewed the pieces that MSR CTG used to automate a newsgroup crowd chart for the
DVD and manually found that information for the top 20 authors. Author's lifetime
participation was retrieved from Google Groups using an author search.
Chart 2: Total People per Month in comp.soft-sys.matlab
[Chart. X-axis: months from Jan-00 to Jul-05. Y-axis: total people (0 to 2500).]
Chart 3: Total Messages per Month in comp.soft-sys.matlab
[Chart. X-axis: months from Jan-00 to Jan-05. Y-axis: total messages (0 to 7000).]
Chart 4: Number of Posts in comp.soft-sys.matlab
[Chart. X-axis: Jan-93 to Jan-05. Y-axis: number of posts (0 to 7000).]
Chart 5: Number of Posts in alt.support.alzheimers
[Chart. X-axis: Aug-97 to Aug-05. Y-axis: number of posts (0 to 2500).]
Chart 6:
[Newsgroup Crowd chart. Annotation on the chart: Spammer, many posts per thread, few days activity.]
Chart 7:
Author lines for user 42214030 for 2004 in alt.support.alzheimers. Note: high numbers of posts
on single dates with few posts in each thread. Further examination shows these posts to be spam.