EASTERN MICHIGAN UNIVERSITY
DIVISION OF ACADEMIC AFFAIRS
REQUEST FOR NEW COURSE
DEPARTMENT/SCHOOL: COMPUTER SCIENCE    COLLEGE: ARTS AND SCIENCES
CONTACT PERSON: WILLIAM MCMILLAN
CONTACT PHONE: 7-1063
CONTACT EMAIL: WMCMILLAN@EMICH.EDU
REQUESTED START DATE: TERM: FALL    YEAR: 2012
A. Rationale/Justification for the Course
In a September 15, 2008 piece in Scientific American, Nigel Shadbolt and Tim Berners-Lee wrote, “…the Web is more
than the sum of its pages. Vast emergent properties have arisen that are transforming society. … A new branch of
science—Web science—aims to address [these phenomena].”1
The World Wide Web, the tools used to interact with it, its many user communities, its vast and complex data stores,
and its relationships with all systems that characterize modern civilization define collectively a natural system that is
increasingly the subject of empirical scientific study. Just as biological life forms have evolved into complex systems
deserving of scientific study, this Internet-based amalgamation has evolved to be an important subsystem of the natural
environment. No single collection of people has designed it explicitly or can derive through pure logic its future
behavior. It is neither a purely technological system nor a purely human system. It is a system composed of many
interacting heterogeneous subsystems, similar in many ways to geologic or biologic systems.
This course samples from the many empirical research techniques that have been employed in computer science to give
a general education student experience in carrying out laboratory studies in the field and an introductory theoretical
view of the Web as a natural system. The theoretical foundations for the course, employed across topics, can be
grouped into these major categories:
• Principles of Computational Thinking2: Algorithms, distributed computing, semantic modeling, heuristic search, pattern matching.
• Complex and chaotic systems: Emergent properties of complex, dynamic systems that are sensitive to initial conditions, “fractal” in nature, and have the appearance of being non-deterministic.
• Patterns of Web use by individuals and groups: Theories of human-computer interaction such as the GOMS model, ways in which computing technology can enable social interaction, Web ecosystems, privacy and ethics principles.
• Techniques and theories supporting the computing infrastructure: Computer networking, database structures, data representation, methods of managing contention for common resources, techniques to ensure data security.

1 http://www.scientificamerican.com/article.cfm?id=web-science
2 Computational thinking is a phrase coined by Jeannette Wing of Carnegie Mellon University to refer to the generalized principles and theories of computer science that can be employed across multiple domains.
B. Course Information
1. Subject Code and Course Number: COSC 104
2. Course Title: Web Science
3. Credit Hours: 3
4. Repeatable for Credit? Yes_______ No___x____
If “Yes”, how many total credits may be earned?_______
5. Catalog Description (Limit to approximately 50 words.):
Empirical study of the global, emergent systems facilitated by the World Wide Web. Computing infrastructure that
enables entities to interact and share information via dynamic, virtual systems. Theoretical foundations such as
chaotic and complex systems, computational thinking, and virtual interaction spaces. Empirical methods. Forming
hypotheses from general principles and testing them through data collection and analysis.
6. Method of Delivery (Check all that apply.)
a. Standard (lecture/lab): X (On Campus: X; Off Campus: ___)
b. Fully Online: ___
c. Hybrid/Web Enhanced: ___
7. Grading Mode: Normal (A-E): X    Credit/No Credit: ___
8. Prerequisites: Courses that MUST be completed before a student can take this course. (List by Subject Code, Number and Title.)
None.
9. Concurrent Prerequisites: Courses listed in #5 that MAY also be taken at the same time as a student is taking this course. (List by Subject Code, Number and Title.)
None.
10. Corequisites: Courses that MUST be taken at the same time as a student is taking this course. (List by Subject Code, Number and Title.)
None.
11. Equivalent Courses. A student may not earn credit for both a course and its equivalent. A course will count as a repeat if an equivalent
course has already been taken. (List by Subject Code, Number and Title)
None.
12. Course Restrictions:
a. Restriction by College. Is admission to a specific College Required?
College of Business: Yes ___ No _x_
College of Education: Yes ___ No _x_
b. Restriction by Major/Program. Will only students in certain majors/programs be allowed to take this course? Yes ___ No _X_
If “Yes”, list the majors/programs
c. Restriction by Class Level. Check all those who will be allowed to take the course:
Undergraduate: All undergraduates _X_, Freshperson ___, Sophomore ___, Junior ___, Senior ___, Second Bachelor _X_, UG Degree Pending ___, Post-Bac. Tchr. Cert. ___, Low GPA Admit ___
Graduate: All graduate students ___, Certificate ___, Masters ___, Specialist ___, Doctoral ___
Note: If this is a 400-level course to be offered for graduate credit, attach Approval Form for 400-level Course for Graduate
Credit. Only “Approved for Graduate Credit” undergraduate courses may be included on graduate programs of study.
Note: Only 500-level graduate courses can be taken by undergraduate students. Undergraduate students may not register for 600-level courses.
d. Restriction by Permission. Will Departmental Permission be required? Yes ___ No ___
(Note: Department permission requires the department to enter authorization for every student registering.)
13. Will the course be offered as part of the General Education Program? Yes _X_ No ___
If “Yes”, attach Request for Inclusion of a Course in the General Education Program: Education for Participation in the Global Community form. Note: All new courses proposed for inclusion in this program will be reviewed by the General Education Advisory Committee. If this course is NOT approved for inclusion in the General Education program, will it still be offered? Yes ___ No _X_
C. Relationship to Existing Courses
Within the Department:
14. Will this course be a requirement or restricted elective in any existing program(s)? Yes ___ No _X_
If “Yes”, list the programs and attach a copy of the programs that clearly shows the place the new course will have in the curriculum.
Program: __________ Required ___ Restricted Elective ___
Program: __________ Required ___ Restricted Elective ___
15. Will this course replace an existing course? Yes ___ No _X_
16. (Complete only if the answer to #15 is “Yes.”)
a. Subject Code, Number and Title of course to be replaced:
b. Will the course to be replaced be deleted?
Yes
No
17. (Complete only if the answer to #16b is “Yes.”) If the replaced course is to be deleted, it is not necessary to submit a Request for Graduate and Undergraduate Course Deletion.
a. When is the last time it will be offered? Term: ______ Year: ______
b. Is the course to be deleted required by programs in other departments? Yes ___ No ___
Contact the Course and Program Development Office if necessary.
c. If “Yes”, do the affected departments support this change?
Yes
No
If “Yes”, attach letters of support. If “No”, attach letters from the affected department explaining the lack of support, if available.
Outside the Department: The following information must be provided. Contact the Course and Program Development office for
assistance if necessary.
18. Are there similar courses offered in other University Departments? Yes ___ No _X_
If “Yes”, list courses by Subject Code, Number and Title:
19. If similar courses exist, do the departments in which they are offered support the proposed course? Yes ___ No ___
If “Yes”, attach letters of support from the affected departments. If “No”, attach letters from the affected department explaining the lack of
support, if available.
D. Course Requirements
20. Attach a detailed Sample Course Syllabus including:
a. Course goals, objectives and/or student learning outcomes
b. Outline of the content to be covered
c. Student assignments including presentations, research papers, exams, etc.
d. Method of evaluation
e. Grading scale (if a graduate course, include graduate grading scale)
f. Special requirements
g. Bibliography, supplemental reading list
h. Other pertinent information.
NOTE: COURSES BEING PROPOSED FOR INCLUSION IN THE EDUCATION FOR PARTICIPATION IN THE GLOBAL
COMMUNITY PROGRAM MUST USE THE SYLLABUS TEMPLATE PROVIDED BY THE GENERAL EDUCATION
ADVISORY COMMITTEE. THE TEMPLATE IS ATTACHED TO THE REQUEST FOR INCLUSION OF A COURSE IN THE
GENERAL EDUCATION PROGRAM: EDUCATION FOR PARTICIPATION IN THE GLOBAL COMMUNITY FORM.
E. Cost Analysis (Complete only if the course will require additional University resources. Fill in Estimated Resources for the sponsoring department(s). Attach separate estimates for other affected departments.)

Estimated Resources:    Year One      Year Two      Year Three
Faculty / Staff         $_________    $_________    $_________
SS&M                    $_________    $_________    $_________
Equipment               $_________    $_________    $_________
Total                   $____0____    $____0____    $____0____
F. Action of the Department/School and College
1. Department/School
Vote of faculty: For ___15____
Against ____0____
Abstentions ___0_____
(Enter the number of votes cast in each category.)
<signed> W. W. McMillan
Department Head/School Director Signature
12 Mar 2012
Date
2. College/Graduate School
A. College
College Dean Signature
Date
B. Graduate School (if Graduate Course)
Graduate Dean Signature
Date
G. Approval
Associate Vice-President for Academic Programming Signature
Date
COSC 104: Web Science
MASTER SYLLABUS
Rationale (for Gen Ed Inclusion):
In a September 15, 2008 piece in Scientific American, Nigel Shadbolt and Tim Berners-Lee
wrote, “…the Web is more than the sum of its pages. Vast emergent properties have arisen that
are transforming society. … A new branch of science—Web science—aims to address [these
phenomena].”1
The World Wide Web, the tools used to interact with it, its many user communities, its vast and
complex data stores, and its relationships with all systems that characterize modern civilization
define collectively a natural system that is increasingly the subject of empirical scientific study.
Just as biological life forms have evolved into complex systems deserving of scientific study,
this Internet-based amalgamation has evolved to be an important subsystem of the natural
environment. No single collection of people has designed it explicitly or can derive through pure
logic its future behavior. It is neither a purely technological system nor a purely human system.
It is a system composed of many interacting heterogeneous subsystems, similar in many ways to
geologic or biologic systems.
This course samples from the many empirical research techniques that have been employed in
computer science to give a general education student experience in carrying out laboratory
studies in the field and an introductory theoretical view of the Web as a natural system. The
theoretical foundations for the course, employed across topics, can be grouped into these major
categories:
• Principles of Computational Thinking2: Algorithms, distributed computing, semantic modeling, heuristic search, pattern matching.
• Complex and chaotic systems: Emergent properties of complex, dynamic systems that are sensitive to initial conditions, “fractal” in nature, and have the appearance of being non-deterministic.
• Patterns of Web use by individuals and groups: Theories of human-computer interaction such as the GOMS model, ways in which computing technology can enable social interaction, Web ecosystems, privacy and ethics principles.
• Techniques and theories supporting the computing infrastructure: Computer networking, database structures, data representation, methods of managing contention for common resources, techniques to ensure data security.

1 http://www.scientificamerican.com/article.cfm?id=web-science
2 Computational thinking is a phrase coined by Jeannette Wing of Carnegie Mellon University to refer to the generalized principles and theories of computer science that can be employed across multiple domains.
Course Description:
Empirical study of the global, emergent systems facilitated by the World Wide Web. Computing infrastructure that enables entities to interact and share information via dynamic, virtual systems. Theoretical foundations such as chaotic and complex systems, computational thinking, and virtual interaction spaces. Empirical methods. Forming hypotheses from general principles and testing them through data collection and analysis.
Course Credits: 3.0
Course Pre-requisites: None.
Course Goals:
Students will be able to:
1. explain and apply the theoretical foundations of the natural science of the Web, comprising computational principles, complex and chaotic systems, human subsystems, and principles of the infrastructure of distributed computing systems;
2. design and implement empirical studies in order to test hypotheses derived from theories of
the Web;
3. proficiently use computing environments that enable the interaction and sharing of
information in a disciplined, effective, and secure manner; and
4. provide and explain examples of dynamic virtual Web systems and their use in interacting
and sharing information.
Method of Evaluation:
Laboratory Reports 40%
Homework Exercises 15%
Quizzes and Exams 45%
Grading Scale:
A: 93-100    A-: 90-92    B+: 87-89    B: 83-86
B-: 80-82    C+: 77-79    C: 73-76     C-: 70-72
D+: 67-69    D: 63-66     D-: 60-62    E: < 60
Suggested Textbooks:
1. Coursepack and laboratory manual (currently under development).
2. Mining the Social Web by Matthew A. Russell (O'Reilly Media, 2011. ISBN-13: 9781449388348; ISBN-10: 1449388345)
3. Introductory Statistics for Engineering Experimentation by Peter R. Nelson, Karen A.F. Copeland, and Marie Coffin (Elsevier, 2003. ISBN-13: 9780125154239)
Course Topics and Schedule3:
1. The Web as a Natural System (1 week)
a. Importance to modern civilization of virtual, Web-based systems
b. Comparison of the Web’s complexity and evolutionary development to, e.g.,
biological, psychological, and geologic systems.
c. Overview of underlying theories
d. Introduction to empirical methods
2. The Web as a Data Environment (1.5 weeks)
a. Web pages and documents as an evolving set of emergent databases
b. Indexing methods and effects on searching
c. Comparing existing search technologies
i. Precision versus coverage
ii. Signal detection - Receiver Operating Characteristics
3. Connectivity and Routing between Web-Connected Elements (2 weeks)
a. Ways in which computing, data, and human elements are connected on the Web
b. How information is transmitted
c. Routing and reassembly of information
d. Identification of subsystems on the Web
e. Determining packets’ paths with tools such as Traceroute
i. Identifying bottlenecks
ii. Effect on traffic of temporal cycles, e.g., time-of-day
iii. Measuring logical path length as number of hops
f. Getting information about Web nodes using, e.g., Ping
i. End-to-end delay
ii. Nature of units of information transmitted, e.g., packet size
g. Resolving Web addresses via DNS lookup
4. Web-Based Communities (1.5 weeks)
a. Social spaces on the Web
b. Kinds of Web interactions
c. Collective community actions
d. Individuals’ behavior in relation to Web communities (identity, reputation, presence,
and engagement)
5. Data on the Web (2 weeks)
a. A closer look at kinds of data
b. Methods of representation and storage
c. Quality of data vs. efficiency
d. Computing infrastructure to support data repositories and interconnectivity
e. Virtual subsystems on the Web as emergent databases
f. The Cloud as a popular way to conceptualize virtual subsystems
6. Privacy and Security (1.5 weeks)
a. Sources and nature of threats and vulnerabilities
b. Communities’ collective defenses as immune responses
c. Techniques for ensuring security
d. Accessibility vs. privacy
7. Individuals’ Usage Patterns (1 week)
a. The user as an element in Web dialogues
b. Categories of patterns of use, including:
i. Means-ends analysis
ii. Foraging (short-horizon means-ends)
iii. Hedonistic
c. Individuals’ assessment of, and reactions to, risks on the Web
8. Artificial Intelligence and the Web (1.5 weeks)
a. AI as heuristic reasoning and state space search
b. Web-based systems that employ AI
c. Modeling of users by subsystems on the Web
i. Recommender systems
ii. Profiling to predict likely actions
d. Finding meaningful patterns via data mining
9. Distributed Computing (1.5 weeks)
a. Web-based subsystems as computational engines
b. Non-centralized control of computation compared to hierarchical control
c. Developing virtual computing structures via the Cloud
d. Infrastructure
i. Technological
ii. Human
e. Maintaining integrity of data and computational results
10. The future of the Web (1 week)
a. Society’s perceptions of the Web, its effects, and its directions (employing popular
literature and commentary)
b. Students’ predictions and evaluations of emerging Web-based entities and forces
c. Ethics in the age of the Web

3 This topic outline focuses on concepts and theories covered in lecture and discussion. Related laboratory exercises are specified separately.
Academic Dishonesty Policy:
Academic dishonesty, including all forms of cheating, falsification, and/or plagiarism, will not be
tolerated in this course. Each student is expected to submit individually prepared work.
Any instance of academic dishonesty during any exam will result in an automatic failing grade
for the course. You are free to discuss problems or questions on laboratory exercises or projects with your classmates. However, the submitted work should reflect the individual student’s efforts. Any student who submits work that is determined not to be his/her own will be given a grade of ZERO for the first offense. In addition, ALL other students involved will also be given a grade of ZERO. Any subsequent instance of academic dishonesty will result in a failing grade for the course.
In addition, you may be referred to the Office of Student Judicial Services for discipline that can
result in either a suspension or permanent dismissal. The Student Conduct Code contains
detailed definitions of what constitutes academic dishonesty. If you are not sure about whether
something you are doing would be considered academic dishonesty, consult with the course
instructor. You may access the Code and other helpful resources online at
http://www.emich.edu/sjs.
Students with Disabilities:
If you require special arrangements due to a disability, please see the instructor as early in the
term as possible.
Bibliography
Search Engines
1. Alfano, Marco and Biagio Lenzitti. “A web search methodology for different user
typologies,” CompSysTech '09: Proceedings of the International Conference on Computer
Systems and Technologies and Workshop for PhD Students in Computing, June 2009.
2. Feng, Juan and Xiaoquan (Michael) Zhang. “Dynamic price competition on the internet:
advertising auctions,” EC '07: Proceedings of the 8th ACM conference on Electronic
commerce, June 2007.
3. Ghose, Anindya and Sha Yang. “Analyzing search engine advertising: firm behavior and
cross-selling in electronic markets,” WWW '08: Proceeding of the 17th international
conference on World Wide Web, April 2008.
4. Jansen, Bernard J. “The comparative effectiveness of sponsored and nonsponsored links for
Web e-commerce queries,” Transactions on the Web (TWEB), Volume 1 Issue 1, May 2007.
5. Juan, Yun-Fang and Chi-Chao Chang. “An analysis of search engine switching behavior
using click streams,” WWW '05: Special interest tracks and posters of the 14th international
conference on World Wide Web, May 2005.
6. Lahaie, Sébastien and David M. Pennock. “Revenue analysis of a family of ranking rules for
keyword auctions,” EC '07: Proceedings of the 8th ACM conference on Electronic
commerce, June 2007.
7. McCown, Frank and Michael L. Nelson. “Agreeing to disagree: search engines and their
public interfaces,” JCDL '07: Proceedings of the 7th ACM/IEEE-CS joint conference on
Digital libraries, June 2007.
8. Poblete, Barbara and Ricardo Baeza-Yates. “Query-sets: using implicit feedback and query
patterns to organize web documents,” Proceedings of the 17th international conference on
World Wide Web (New York, NY, 2008).
9. Poremsky, Diane. Google and Other Search Engines: Visual Quickstart Guide (Peachpit
Press, May, 2004).
10. Sun, Jian-Tao, Xuanhui Wang, Dou Shen, Hua-Jun Zeng, and Zheng Chen. “CWS: a
comparative web search system,” WWW '06: Proceedings of the 15th international
conference on World Wide Web, May 2006.
11. Vrochidis, Stefanos, Ioannis Kompatsiaris, and Ioannis Patras. “Optimizing visual search
with implicit user feedback in interactive video retrieval,” CIVR '10: Proceedings of the
ACM International Conference on Image and Video Retrieval, July 2010.
Connectivity and Routing
12. Habib, Md. Ahsan and Marc Abrams. “Analysis of Bottlenecks in International Internet Links,” Virginia Polytechnic Institute & State University, Blacksburg, VA, USA, 1998.
13. Bickerstaff, Cindy, Ken True, Charles Smothers, Tod Oace, Jeff Sedayao, and Clinton Wong.
“Don't just talk about the weather - manage it! a system for measuring, monitoring, and
managing internet performance and connectivity,” NETA'99 Proceedings of the 1st
conference on Conference on Network Administration - Volume 1, USENIX Association
Berkeley, CA, USA ©1999.
14. Chen, Thomas M. “Increasing the observability of Internet behavior,” Communications of
the ACM, Volume 44 Issue 1, Jan. 2001. ACM New York, NY, USA.
15. Fan, Xun and John Heidemann. “Selecting representative IP addresses for internet topology
studies,” IMC '10 Proceedings of the 10th annual conference on Internet measurement.
(ACM New York, NY, USA ©2010 ISBN: 978-1-4503-0483-2)
16. Kar, Dulal C. “Internet path characterization using common internet tools,” Journal of
Computing Sciences in Colleges. Volume 18 Issue 4, April 2003. Consortium for Computing
Sciences in Colleges , USA.
17. Logg, Connie, Les Cottrell and Jiri Navratil. “Experiences in traceroute and available
bandwidth change analysis,” NetT '04 Proceedings of the ACM SIGCOMM workshop on
Network troubleshooting: research, theory and operations practice meet malfunctioning
reality (ACM New York, NY, USA ©2004. ISBN:1-58113-942-X).
18. Oliveira, Ricardo V., Beichuan Zhang and Lixia Zhang. “Observing the evolution of internet AS topology,” SIGCOMM '07: Proceedings of the 2007 conference on Applications, technologies, architectures, and protocols for computer communications. ACM SIGCOMM Computer Communication Review, Volume 37 Issue 4, October 2007. ACM New York, NY, USA. ISBN: 978-1-59593-713-1.
19. Rasti, Amir H., Nazanin Magharei, Reza Rejaie, and Walter Willinger. “Eyeball ASes: from
geography to connectivity,” IMC '10 Proceedings of the 10th annual conference on Internet
measurement. ACM New York, NY, USA ©2010.
20. Viger, Fabien, Brice Augustin, Xavier Cuvellier, Clémence Magnien, Matthieu Latapy,
Timur Friedman, and Renata Teixeira. “Detection, understanding, and prevention of
traceroute measurement artifacts,” Computer Networks: The International Journal of
Computer and Telecommunications Networking, Volume 52 Issue 5, April, 2008.
User Communities
21. Calvi, Licia. “Personal networks as a case for online communities: two case studies.”
International Journal of Web Based Communities, Volume 5 Issue 1. November 2008.
22. Paul, Sheila A., Marianne Jensen, Chui Yin Wong and Chee Weng Khong. “Socializing in
mobile gaming.” DIMEA '08: Proceedings of the 3rd international conference on Digital
Interactive Media in Entertainment and Arts. September 2008.
23. Robu, Valentin, Harry Halpin and Hana Shepherd. “Emergence of consensus and shared
vocabularies in collaborative tagging systems.” Transactions on the Web (TWEB), Volume 3
Issue 4. September 2009.
24. Sosa, Manuel E. “Where Do Creative Interactions Come From? The Role of Tie Content and
Social Networks.” Organization Science, Volume 22 Issue 1. January 2011.
25. Torkjazi, Mojtaba, Reza Rejaie and Walter Willinger. “Hot today, gone tomorrow: on the
migration of MySpace users.” WOSN '09: Proceedings of the 2nd ACM workshop on Online
social networks. August 2009.
26. Valafar, Masoud, Reza Rejaie and Walter Willinger. “Beyond friendship graphs: a study of
user interactions in Flickr.” WOSN '09: Proceedings of the 2nd ACM workshop on Online
social networks. August 2009
Data
27. Chierichetti, Flavio, Silvio Lattanzi and Alessandro Panconesi. “Gossiping (via mobile?) in
social networks.” DIALM-POMC '08: Proceedings of the fifth international workshop on
Foundations of mobile computing. August 2008.
28. Huang, Lailei and Zhengyou Xia. “User Character and Communication Pattern Detecting on
Social Network Site.” ICEC '09: Proceedings of the 2009 International Conference on
Engineering Computation. May 2009.
29. Leung, Cane Wing-ki, Ee-Peng Lim, David Lo and Jianshu Weng. “Mining interesting link
formation rules in social networks.” CIKM '10: Proceedings of the 19th ACM international
conference on Information and knowledge management. October 2010.
Privacy
30. Asuncion, Arthur U. and Michael T. Goodrich. "Turning Privacy Leaks into Floods:
Surreptitious Discovery of Social Network Friendships and Other Sensitive Binary Attribute
Vectors." WPES '10: Proceedings of the 9th Annual ACM Workshop on Privacy in the
Electronic Society. October 2010.
31. Freni, Dario, Carmen Ruiz Vicente, Sergio Mascetti, Claudio Bettini and Christian S. Jensen.
"Preserving Location and Absence Privacy in Geo-Social Networks." CIKM '10: Proceedings
of the 19th ACM International Conference on Information and Knowledge Management.
October 2010.
32. Zhou, Bin and Jian Pei. “Preserving Privacy in Social Networks Against Neighborhood
Attacks” ICDE '08: Proceedings of the 2008 IEEE 24th International Conference on Data
Engineering. April 2008.
Individual User Behavior Patterns
33. Bhavnani, Suresh K. "Domain-Specific Search Strategies for the Effective Retrieval of
Healthcare and Shopping Information." CHI. pp. 610 - 611. 2002.
34. Kalbach, James. "Designing for Information Foragers: A Behavioral Model for Information Seeking on the World Wide Web." Internet Technical Group, 3.3, Dec. 2000.
35. White, Ryen W. and Steven M. Drucker. "Investigating Behavioral Variability in Web
Search." International World Wide Web Conference. pp. 21 - 30. 2007.
36. Widen-Wulff, Gunilla, Stefan Ek, Mariam Ginman, Reija Perttila, Pia Sodergard and Anna-Karin Totterman. "Information Behavior Meets Social Capital: A Conceptual Model." Journal of Information Science, Volume 34 Issue 3. June 2008.
Artificial Intelligence
37. Cohen, Paul R. Empirical Methods for Artificial Intelligence. (MIT Press, 1995. ISBN 10:
0262032252)
38. Denker, Manfred, Wojbor A. Woyczynski and Bernard Ycart. Introductory Statistics and
Random Phenomena: Uncertainty, Complexity and Chaotic Behavior in Engineering and
Science. Statistics for Industry and Technology, Nov 1, 1998. ISBN: 0817640312
39. Dressler, Fuchs, Truchat, Yao, Lu, and Marquardt. “Profile-Matching Techniques for On-Demand Software Management in Sensor Networks.” EURASIP Journal on Wireless Communications and Networking. Vol. 2007.
40. Li, Yung-Ming and Han-Wen Hsiao. “Recommender Service for Social Network based
Applications.” ICEC '09: Proceedings of the 11th International Conference on Electronic
Commerce. August 2009.
41. Lu, Eichstaedt, and Ford. “Efficient Profile Matching for Large Scale Webcasting.”
Proceedings of the Seventh International World Wide Web Conference. April 1998.
42. Ma, Hao, Dengyong Zhou, Chao Liu, Michael R. Lyu and Irwin King. “Recommender
systems with social regularization.” WSDM '11: Proceedings of the fourth ACM
international conference on Web search and data mining. February 2011.
Distributed Computing
43. Cooper, Treuille, et al. “The Challenge of Designing Scientific Discovery Games.”
Foundations of Digital Games Conference. 2010.
44. Goldsmith and Owen. “The Search for Life in the Universe.” University Science Books. 3rd
ed. 2001.
45. Lang and Armitage. “An Ns2 Model for the Xbox System Link Game Halo.” ATNAC 2003. Melbourne, Australia.
Web Science, Social Networking
46. Abraham, Ajith, Aboul-Ella Hassanien and Václav Snášel. Computational Social Network
Analysis: Trends, Tools and Research Advances (Computer Communications and Networks,
Springer-Verlag London, Dec 21, 2009. ISBN: 9781848822283)
47. Ang, Chee Siang and Panayiotis Zaphiris. “Simulating Social Networks of Online
Communities: Simulation as a Method for Sociability Design.” INTERACT '09: Proceedings
of the 12th IFIP TC 13 International Conference on Human-Computer Interaction: Part II.
August 2009.
48. Asuncion, Arthur U. and Michael T. Goodrich. “Turning privacy leaks into floods:
surreptitious discovery of social network friendships and other sensitive binary attribute
vectors.” WPES '10: Proceedings of the 9th annual ACM workshop on Privacy in the
electronic society. October 2010.
49. Bakshy, Eytan, Brian Karrer and Lada A. Adamic. “Social influence and the diffusion of
user-created content.” EC '09: Proceedings of the 10th ACM conference on Electronic
commerce. July 2009.
50. Berners-Lee, Tim, Wendy Hall, James A. Hendler. A Framework for Web Science (Foundations and Trends® in Web Science). Now Publishers, 2006. ISBN: 1-933019-33-6.
51. Burke, Moira, Cameron Marlow and Thomas Lento. “Social network activity and social
well-being.” CHI '10: Proceedings of the 28th international conference on Human factors in
computing systems. April 2010.
52. Cheung, Christy M. K. and Matthew K. O. Lee. “A theoretical model of intentional social
action in online social networks.” Decision Support Systems, Volume 49 Issue 1. April 2010.
53. Daradoumis, Thanasis, Alejandra Martínez-Monés and Fatos Xhafa. “A layered framework for evaluating on-line collaborative learning interactions.” International Journal of Human-Computer Studies, Volume 64 Issue 7. July 2006.
54. De Choudhury, Munmun, Winter A. Mason, Jake M. Hofman and Duncan J. Watts.
“Inferring relevant social networks from interpersonal communication.” WWW '10:
Proceedings of the 19th international conference on World Wide Web. April 2010.
55. Easley, David and Jon Kleinberg. Networks, Crowds, and Markets: Reasoning About a
Highly Connected World. (Cambridge University Press, July 19, 2010. ISBN: 0521195330)
56. Eubank, Stephen, V. S. Anil Kumar, Madhav V. Marathe, Aravind Srinivasan and Nan
Wang. “Structural and algorithmic aspects of massive social networks.” SODA '04:
Proceedings of the fifteenth annual ACM-SIAM symposium on Discrete algorithms. January
2004.
57. Freni, Dario, Carmen Ruiz Vicente, Sergio Mascetti, Claudio Bettini and Christian S. Jensen.
“Preserving location and absence privacy in geo-social networks” CIKM '10: Proceedings of
the 19th ACM international conference on Information and knowledge management. October
2010.
58. Fushimi, Takayasu, Takashi Kawazoe, Kazumi Saito, Masahiro Kimura and Hiroshi Motoda.
“What Does an Information Diffusion Model Tell about Social Network Structure?”
Knowledge Acquisition: Approaches, Algorithms and Applications. May 2009.
59. Groh, Georg and Christian Ehmig. “Recommendations in taste related domains: collaborative
filtering vs. social filtering.” GROUP '07: Proceedings of the 2007 international ACM
conference on Conference on supporting group work. November 2007.
60. Huang, Jian, Ziming Zhuang, Jia Li and C. Lee Giles. “Collaboration over time:
characterizing and modeling network evolution.” WSDM '08: Proceedings of the
international conference on Web search and web data mining. February 2008.
61. Kimmerle, Joachim, Johannes Moskaliuk and Ulrike Cress. “Learning and knowledge
building with social software.” CSCL'09: Proceedings of the 9th international conference on
Computer supported collaborative learning - Volume 1, Volume 1. June 2009.
62. Kleinberg, Jon. “Social networks, incentives, and search” SIGIR '06: Proceedings of the 29th
annual international ACM SIGIR conference on Research and development in information
retrieval. August 2006.
63. Kuan, Huei-Huang and Gee-Woo Bock. “Trust transference in brick and click retailers: An
investigation of the before-online-visit phase.” Information and Management, Volume 44
Issue 2. March 2007.
64. Kuhlman, Chris J., V. S. Anil Kumar, Madhav V. Marathe, S. S. Ravi and Daniel J.
Rosenkrantz. “Finding critical nodes for inhibiting diffusion of complex contagions in social
networks.” ECML PKDD'10: Proceedings of the 2010 European conference on Machine
learning and knowledge discovery in databases: Part II. September 2010.
65. Lin, Chieh-Peng. “Assessing the mediating role of online social capital between social
support and instant messaging usage.” Electronic Commerce Research and Applications,
Volume 10 Issue 1. January 2011.
66. Lin, Kuan-Yu and Hsi-Peng Lu. “Why people use social networking sites: An empirical
study integrating network externalities and motivation theory.” Computers in Human
Behavior, Volume 27 Issue 3. May 2011.
67. Magnani, Matteo, Danilo Montesi and Luca Rossi. “Information Propagation Analysis in a
Social Network Site.” ASONAM '10: Proceedings of the 2010 International Conference on
Advances in Social Networks Analysis and Mining. August 2010.
68. Malinka, Kamil and Jiri Schafer. “Development of Social Networks in Email
Communication.” ICIMP '09: Proceedings of the 2009 Fourth International Conference on
Internet Monitoring and Protection. May 2009.
69. Maserrat, Hossein and Jian Pei. “Neighbor query friendly compression of social networks.”
KDD '10: Proceedings of the 16th ACM SIGKDD international conference on Knowledge
discovery and data mining. July 2010.
70. Mayer, Adalbert. “Online social networks in economics.” Decision Support Systems,
Volume 47 Issue 3. June 2009.
71. Mislove, Alan, Hema Swetha Koppula, Krishna P. Gummadi, Peter Druschel and Bobby
Bhattacharjee. “Growth of the flickr social network.” WOSP '08: Proceedings of the first
workshop on Online social networks. August 2008.
72. Moturu, Sai T., Jian Yang and Huan Liu. “Quantifying Utility and Trustworthiness for
Advice Shared on Online Social Media” CSE '09: Proceedings of the 2009 International
Conference on Computational Science and Engineering - Volume 04, August 2009.
73. Wang, Yu, Gao Cong, Guojie Song and Kunqing Xie. “Community-based greedy algorithm
for mining top-K influential nodes in mobile social networks.” KDD '10: Proceedings of the
16th ACM SIGKDD international conference on Knowledge discovery and data mining. July
2010.
74. Widén-Wulff, Gunilla, Stefan Ek, Mariam Ginman, Reija Perttilä, Pia Södergård and Anna-Karin Tötterman. “Information behaviour meets social capital: a conceptual model.” Journal of Information Science, Volume 34 Issue 3. June 2008.
75. White, Su, et al. “Negotiating the Web Science Curriculum through Shared Educational Artefacts,” Proceedings of the 3rd International Conference on Web Science, June 14-17, 2011.
76. Xiaodong, Zhao, Guo Weiwei and Mark Greeven. “An Empirical Study on the Relationship
Between Entrepreneur's Social Network and Entrepreneurial Performance: The Case of the
Chinese IT Industry”. IFITA '10: Proceedings of the 2010 International Forum on
Information Technology and Applications - Volume 03, Volume 03. July 2010.
77. Ye, Qi, Bin Wu, Yuan Gao and Bai Wang. “Empirical Analysis and Multiple Level Views in
Massive Social Networks.” WI-IAT '10: Proceedings of the 2010 IEEE/WIC/ACM
International Conference on Web Intelligence and Intelligent Agent Technology - Volume
01, Volume 01. August 2010.
78. Yeung, Ching-man Au and Tomoharu Iwata. “Capturing implicit user influence in online
social sharing.” HT '10: Proceedings of the 21st ACM conference on Hypertext and
hypermedia. June 2010.
Sample Laboratory Experiment #1: Search Engines
Hypothesis
• Differences in search algorithms cause large differences in search results.
Learning Goals
We will verify the validity of this hypothesis in this experiment through:
• Using different search engines for different types of searches;
• Ranking the “suitability” of three major search engines; and
• Exploring how search engines generate different results for current events and past events.
Resources
For this experiment, you will need access to the Internet and use a web browser and a
spreadsheet application.
Procedure
In this experiment you will be ranking the “suitability” of the three major search engines to
different types of searches. Two types of queries are to be explored – Current Events (things in
the news or recent within the past 48 hours) and Past Events (things from the past – more than
one week old). You are to present three different queries (search phrases) to the search engines,
and then systematically investigate the suitability of the top five links presented by the search
engines. You should produce a one page written report based on the analysis of the collected data
linking the “suitability” of the Search Engines to searches on Current events and Past Events or
lack thereof.
Here are the steps for the experiment. First, you will conduct the experiment using current events; you will then repeat the process for past events.
1. Identify three search phrases on current events.
2. Identify three search engines.
3. Search the three phrases on the three search engines and write down the top two links presented by the search engines X, Y, Z for each phrase. For example, if your search phrase is “Japanese Tsunami”, you will select the top two links in the result coming from each of the X, Y, Z search engines.
4. Next, calculate how related each webpage is to the search phrase by performing a word count. You may do this by counting the number of occurrences of the search phrase as a whole within the webpage (a scripting sketch follows this list). There should be six of these counts for each search engine – 2 links and 3 search phrases. Total the six numbers to get a value for each search engine.
5. Using alexa.com or a suitable utility, obtain the number of visits to each of the web pages over a fixed time period such as one week. Again, there are six of these figures. Total the six numbers to get a value for each search engine.
6. Using a suitable utility, obtain the number of links (reference count) to each of the web pages. Again, there are six of these figures. Total the six numbers to get a value for each search engine.
7. Using a suitable utility such as blogsearch.google.com, obtain the blog chatter related to the web pages. There are six of these figures. Total the six numbers to get a value for each search engine.
8. Using a spreadsheet, enter the three search engines in one row or column and the corresponding numbers for word count, visits, reference count, and blog chatter. Produce a bar chart with these figures for each of the four categories.
Repeat the above steps with past events.
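The word count in step 4 can be automated. Below is a minimal Python sketch, assuming the page is plain HTML; the URL and phrase are illustrative placeholders, and the regular-expression tag strip is only a rough way to keep markup out of the count.

# Minimal sketch for step 4 (word count). URL and phrase are placeholders.
import re
import urllib.request

def phrase_count(url, phrase):
    """Fetch a page and count whole-phrase occurrences, case-insensitively."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        html = resp.read().decode("utf-8", errors="ignore")
    text = re.sub(r"<[^>]+>", " ", html)  # crude tag strip so markup is not counted
    return len(re.findall(re.escape(phrase), text, flags=re.IGNORECASE))

if __name__ == "__main__":
    print(phrase_count("https://www.example.com/", "Japanese Tsunami"))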
Analysis/Discussion
You should produce a one page written report based on the analysis of the collected data linking
the “suitability” of the Search Engines to searches on current events and past events or lack
thereof. You are required to submit a Word document with the information for the above steps,
the spreadsheet tabular data and the bar charts.
Grading
• Steps 1-7: 10% each (70% total)
• Summary Report: 30%
Sample Laboratory Experiment #2: Computer Connectivity and Routing
Hypothesis
End-to-end delay associated with the delivery of a packet on the Internet is a function of variable
conditions including the physical distance, the size of the packet, the number of hops, available
bandwidth, and the amount of traffic (time of day) between the two end nodes.
Learning Goals
We will verify the validity of this hypothesis in this experiment through:
• Learning the definition of common computer networking terminologies and giving examples;
• Using network utilities to learn about how a machine is connected to the rest of the Internet;
and
• Understanding the effects of physical distance, bandwidth, hop distance, etc. on the end-to-end delay associated with the delivery of a packet on the Internet.
Resources
For this experiment, you will need access to the Internet and the following network utilities:
• Ping
• DNS Lookup
• Traceroute
In addition, you will also need a web browser.
Procedure
1. Terminologies. Provide a short description of the following terms.
a. Host name. Give an example.
b. IP address. Give an example.
c. Network interface
d. Routing table
e. Packet
f. DNS
g. TLD. Give an example.
2. Using Network Utilities. Every machine connected to the Internet is assigned at least one IP address. Gather the network information for your machine and how it is connected to the Internet using the specified network utilities. Answer the following questions.
a. Network Information.
1. What is the Hardware Address of your machine’s en0 network interface?
2. What is the IP address assigned to your machine?
b. Lookup. Look up the corresponding Internet (IP) address (network address) for each of the following host names. Find the similar data for 5 additional host names of your choice.
• my.emich.edu
• www.citibank.com
• Google.com
• www.cnnic.net.cn
c. Ping. Ping each of the host names in (2b), including the 5 host names you added in Part b above, by sending 10 pings each. What is the percentage of packet loss? What are the minimum, average, maximum, and standard deviation of the round-trip times (RTT) in msec? (A scripting sketch for parts (b) and (c) follows this procedure.)
d. Traceroute. How many hops does it take to get to the 10 addresses defined in (2b) from
your machine?
e. Time-of-day. Repeat parts (c) and (d) above during a different time of day.
f. Geolocation. Using a web search engine, find the physical location of the network addresses defined in (2b). Next, using a mapping web site (e.g., Google Maps, MapQuest), look up the physical distance between your current location and the given network addresses.
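Parts (b) and (c) can be scripted. Below is a minimal Python sketch, assuming a Unix-like system whose ping command accepts the -c flag (on Windows the flag is -n and the summary lines differ); the host list repeats the hosts given above.

# Minimal sketch for parts (b) and (c); assumes a Unix-like `ping -c`.
import socket
import subprocess

hosts = ["my.emich.edu", "www.citibank.com", "Google.com", "www.cnnic.net.cn"]

for host in hosts:
    ip = socket.gethostbyname(host)              # part (b): DNS lookup
    print(f"{host} -> {ip}")
    # Part (c): 10 pings; ping's final lines report packet loss and the
    # min/avg/max/stddev round-trip times in msec.
    result = subprocess.run(["ping", "-c", "10", host],
                            capture_output=True, text=True)
    print("\n".join(result.stdout.splitlines()[-2:]))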
Analysis/Discussion
Based on the data you collected, discuss how the end-to-end delay associated with the delivery of a packet on the Internet varies as a function of the physical distance, the size of the packet, the number of hops, the available bandwidth, and the amount of traffic (time of day) between the two end nodes.
Grading
• Terminologies: 20%
• Procedure: 40%
• Analysis/Discussion: 40%
Sample Laboratory Experiment #3: Social User Behavior
Learning Goals
Here you will study the kinds of social, web-based interactions classified by Malone and
Crumlish in their work on the design of social interfaces. Context A is within the
“Collaboration” sphere of the “Activities” space, specifically collaborative editing endeavors
such as for a Wiki. Context B is within the “Community Management” sphere of the
“Communities” space, specifically norms and group moderation. Somewhat similar tasks are
carried out across Contexts A and B. Our hypothesis is that despite these surface similarities,
there are reliable and discernable differences in the following measures: Number of times a
person posts an exclusionary message, pushing someone else out of the group (e.g., “Get lost,
troll.”); percentage of posts that contribute directly to the task at hand; and percentage of posts
that are affirming or reinforcing of others’ efforts.
Resources
Web-connected computer, accounts and identities on (A) a Wiki (Wikipedia is fine), and (B) a
discussion web site, preferably on a particular topic such as a scientific, political, health, sports,
or business issue.
Procedures
Read users’ posts collected over a period of eight hours in each of Contexts A and B. The time
periods compared have to be equal. You can record posts via copying and pasting, printing to
PDF, or some other method. Categorize each post as (1) task-oriented (in a discussion, this
would be providing objective information or links), (2) exclusionary, (3) affirming, or (4) other.
The last category comprises many possible kinds of postings and it is important to count them in
order to compute percentages.
Analysis and Presentation
Compute the relevant percentages and count the numbers of exclusionary posts. Present these
using a bar chart or similar in a way that shows any apparent differences between Contexts A
and B. Perform t tests on the pairs of proportions and counts.
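As a worked illustration of the statistics, the sketch below compares one pair of proportions using a two-proportion z test, a standard substitute for a t test when the measures are proportions; all counts are invented for illustration.

# Minimal sketch of a two-proportion z test for one measure, e.g., the
# share of task-oriented posts in Contexts A and B. Counts are invented.
import math

def two_proportion_z(x1, n1, x2, n2):
    """z statistic for H0: the two underlying proportions are equal."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

# Context A: 42 task-oriented posts of 60; Context B: 25 of 70.
z = two_proportion_z(42, 60, 25, 70)
print(f"z = {z:.2f}")  # |z| > 1.96 suggests a difference at the .05 level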
Discussion Questions
1. Were there statistically significant differences between the two contexts on any measures?
2. If there are such, what do you think are the reasons?
3. What other comparisons could be made between these contexts of use? (Refer to Malone and Crumlish’s work.)
4. How do the processes you observed in these posts shape the nature of user communities over
time?
Grading
• Understanding of interaction types: 20%
• Completeness of notes and procedures: 20%
• Clarity of graphical presentation: 20%
• Detail and insight in discussion answers: 20%
• Overall quality and style: 20%
Notes for faculty
There are many other kinds of social interaction that could be studied. It would also be possible
to devise a good lab where the nature of interactions over time is investigated, looking at the
effects of particular kinds of postings. Besides relatively slowly-changing discussions such as
those used above, one could feature Twitter feeds or other more stream-oriented technologies.
Sample Laboratory Experiment #4: Data of the Web System
The Web is a system originally designed to share information. The shared information is
distributed, of varying quality, in varying formats, and from varying sources. Understanding the nature of the data is fundamental to developing an understanding of the web as a system for scientific study. Unlike chemistry, where the building blocks were painfully discovered, the fundamental building blocks of the web are engineered. However, as in chemistry, understanding the building blocks of the web system facilitates discovery of the principles of the emergent behavior that is built on those building blocks.
Learning Goals
• Understand the nature of the data in Web 1.0: HTML
• Understand the interaction between data elements: links
• Understand the use of the data through experimentation on web crawlers
• Understand the concepts of precision and coverage (i.e., the ROC matrix)
Resources
For students
• Computer with Internet connection
For laboratory
• Two to four web crawlers that vary in their algorithms for crawling and for indexing. These
crawlers should be limitable to specified servers. The crawlers may be initialized with a set
of desired keywords.
• Four related data sets: Given a set of predefined HTML pages, each page has four versions:
  - Links are present:
    · one version does not use the <META> tag to aid the web crawlers
    · the second does use the <META> tag to identify keywords for the crawlers
  - Links are removed:
    · one version does not use the <META> tag
    · the second does use the <META> tag
• The data sets must be scored a priori by a human in order for the student to compare results against known correct results.
Procedure
Run the web crawlers separately over the complete set of data. Collect the following
measurements:
1. For each page, was it indexed correctly according to keyword?
2. For each web crawler, how long did it take to index the complete set of data?
Analysis and Presentation
Quality of indexing: Using the data collected above, analyze the results for precision and coverage in comparison to the previously human-scored index. For a crawler, what percentage of its misses should have been hits (a low rate indicates good coverage)? What percentage of its hits were actually misses (a high rate indicates poor precision)?
Time of indexing: Using the data collected above, analyze the time to completion for each web
crawler against each data set.
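The following minimal Python sketch shows the precision and coverage arithmetic, assuming each page's crawler result and human score have been recorded as parallel lists of booleans; the sample data are invented.

# Minimal sketch of precision/coverage, given parallel boolean lists:
# crawler_hits[i] = crawler indexed page i; human_hits[i] = human says relevant.
def precision_and_coverage(crawler_hits, human_hits):
    tp = sum(c and h for c, h in zip(crawler_hits, human_hits))
    fp = sum(c and not h for c, h in zip(crawler_hits, human_hits))
    fn = sum(h and not c for c, h in zip(crawler_hits, human_hits))
    precision = tp / (tp + fp) if tp + fp else 0.0  # fraction of hits that were correct
    coverage = tp / (tp + fn) if tp + fn else 0.0   # fraction of relevant pages found
    return precision, coverage

# Invented scores for four pages.
print(precision_and_coverage([True, True, False, True],
                             [True, True, True, False]))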
Discussion Questions
1. What is the nature of the data that will lead to good coverage and precision by a web
crawler?
2. How can one tune precision versus coverage?
3. What is the nature of the data that will lead to faster web crawler performance?
4. What happens to the quality of the results when the data set is modified during a crawl?
Grading
• HTML and links: 10%
• Web crawler: 10%
• Performance measures (speed, precision, coverage): 40%
• Graphical presentation of data: 20%
• Discussion questions: 20%
Sample Laboratory Experiment #5: Emergent Collectives (Recommender Systems)
From the simplest units on the Web (i.e., any object an HTML anchor can reference), combined with a set of user actions (clicks) on those units, a startling result emerges. The users form into groups, intersecting subsets, based on shared clicking (preference) behavior. The objects also
form into groups of similar items based on the user behavior. This emergent structure, based on
user preferences, has developed into the area of recommender systems. The behavior of a
recommender system can be observed by the casual observer through Netflix or Amazon
recommendations.
This laboratory will investigate the nature of the formation of user groups and object groups.
Learning Goals
• Understand the formation of groups based on user behavior.
• Understand the measurements of groups (cohesiveness, durability)
• Understand the effects changing input has on emergent groups.
• Understand recommender systems.
Resources
• Computer with data analysis software such as MATLAB, R, or Sage.
• Toy recommender system, with items, such as movies, and customers. The user has the
ability to enter rankings of items for customers.
Procedures
1. Given a set of n (n <= 10) items and 2 * m users (m = the number of students in the class), each user will rate every item on a 5-star scale (as for Netflix), once as themselves and again as a person well known to the student who is 20 or more years older.
2. Analyze the data to discover disjoint user groups (a grouping sketch follows this list). For each user group, make 1 – 3 recommendations.
3. Now add another n items and rank them.
4. Analyze the new data and the conglomerate data to identify new and modified groups. Again,
for each user group, make 1 – 3 recommendations.
5. Analyze the recommendations made based on groups from the perspective of the individual
users. How high does each user actually rank a recommendation?
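The lab does not prescribe a grouping method; as one naive illustration for step 2, the Python sketch below forms disjoint groups greedily by mean absolute rating difference. The threshold and ratings are invented.

# Minimal sketch of step 2: group users whose ratings are similar.
# Greedy threshold grouping is an illustration, not a prescribed method.
import numpy as np

def group_users(ratings, threshold=2.0):
    """Assign each user to the first group whose seed member is within
    `threshold` mean absolute rating difference; else start a new group."""
    groups = []                      # each group is a list of user indices
    for u, row in enumerate(ratings):
        for g in groups:
            seed = ratings[g[0]]
            if np.abs(row - seed).mean() < threshold:
                g.append(u)
                break
        else:
            groups.append([u])
    return groups

# Rows are users, columns are items (1-5 stars); invented data.
R = np.array([[5, 4, 1], [4, 5, 1], [1, 2, 5], [2, 1, 4]])
print(group_users(R))   # e.g., [[0, 1], [2, 3]]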
Analysis and Presentation
The cohesiveness of a group is measured according to the distance between each element and the median position of the elements in a k-dimensional space (k is the number of characteristics being considered by the experiment).
Calculate the cohesiveness of the groups. Identify groups that have a high or increasing level of
cohesiveness with new preferences. Identify groups with low or decreasing cohesiveness.
From the individual user’s perspective, how attached is he or she to a particular group? Does the user’s attachment reflect group cohesiveness? (A cohesiveness sketch follows.)
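A minimal Python sketch of this cohesiveness measure, using the per-dimension median and the mean Euclidean distance to it; the ratings are invented.

# Minimal sketch of cohesiveness: mean Euclidean distance from each
# group member to the group's per-dimension median. Data are invented.
import numpy as np

def cohesiveness(points):
    """Lower values indicate a tighter, more cohesive group."""
    median = np.median(points, axis=0)
    return float(np.linalg.norm(points - median, axis=1).mean())

# Three users' star ratings on k = 4 items (one row per user).
group = np.array([[5, 4, 1, 2],
                  [4, 5, 2, 1],
                  [5, 5, 1, 1]])
print(f"cohesiveness = {cohesiveness(group):.2f}")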
Discussion Questions
1. How do these emergent groups compare to groups in other scientific systems, e.g., tribal units, species, collections of memes?
2. What is the nature of the objects that would lead to stable groups? Fashion-driven, societal
norms (e.g., authoritarian versus democratic politics), ...
Grading
• Data collection: 40%
• Graphical presentation of data: 40%
• Discussion questions: 20%
Sample Laboratory Exercise #6: Distributed Computing
Learning Goals
In this lab, you will explore the concept of distributed computing (DC) and investigate an ongoing DC project. You will be able to understand and document the characteristics of a distributed system (structural research) and to analyze and report on an ongoing DC project (naturalistic observation).
Resources
A computer with Internet connection.
Procedures
1. Using the textbook and/or an online resource, find the following information about a DC
system:
a. How are the computers (nodes) connected to each other?
b. Do these computers communicate with each other? What does the dedicated server
do?
c. How do these computers help in solving a problem?
d. Is the combined computational power better than a single computer? Supercomputer?
How so?
e. What is BOINC?
2. Select one ongoing DC project, such as SETI@home or Folding@home, and find the
following information:
a. Number and type of active CPUs contributing to the project
b. Server status for the past year
c. Top participants and their credits
d. BOINC projects with most participants
Analysis and Presentation
Draw a distributed computing environment and label its parts. Include the following: nodes,
dedicated server, network, message passing, ports, and BOINC software locations.
Using charts, visually display the information you collected in step 2 of the procedure. One should be able to answer some of the discussion questions, given below, using these displays. A charting sketch follows.
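As one way to produce the charts, the minimal Python sketch below draws a bar chart of active CPUs by type with matplotlib; the counts are placeholders, not real project statistics.

# Minimal charting sketch for step 2(a); counts are invented placeholders.
import matplotlib.pyplot as plt

cpu_types = ["x86", "x86-64", "PowerPC", "ARM"]
active_cpus = [120000, 950000, 30000, 15000]  # invented figures

plt.bar(cpu_types, active_cpus)
plt.ylabel("Active CPUs")
plt.title("Active CPUs by type (illustrative data)")
plt.tight_layout()
plt.savefig("cpu_types.png")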
Discussion Questions
1. What is the main purpose of distributed computing?
2. How is it different from supercomputing, parallel computing, and cloud computing?
3. Provide answers to the following questions, using the charts:
a. What is the type of CPU that is contributing most to the project?
b. How many servers have been rejecting connections in the month of March?
c. What is the maximum credit received by an out-of-country participant?
d. Name any 5 BOINC projects with more than a million users.
Grading
• DC drawing: 15%
• Charts: 30%
• Answers to discussion questions: 40%
• Overall quality and style: 15%