Evaluation of Automation Systems

張裕幸
F.W. Lancaster and Beth Sandore, Technology and Management in
Library and Information Services, Chapter 14: Evaluation of
Automated Systems, 1997, 196-225.
• Evaluation of the performance of an automated system
can provide several useful types of management
information on (1) whether new or updated systems
meet contract requirements; (2) whether the system is
living up to the performance and output standards of its
user community; (3) the point at which a new system or
system refinements are needed; (4) possible future
resource consumption.
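• Evaluating the performance of an automated system yields
several useful kinds of management information: (1) whether a
new or updated system meets contract requirements; (2) whether
system output and performance meet the standards of the user
community; (3) when a new system or a system enhancement is
needed; (4) possible future resource consumption. [1]
• Note 1: The fourth item takes hidden future costs into
account, which is consistent with the concept of total cost of
ownership (TCO).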
• It is obvious that a computer system can
be evaluated according to different types
of criteria – ease of use, cost, reliability,
integrability, and so on.
• A computer system can be evaluated against different
criteria, such as ease of use, cost, reliability (stability),
and integrability.
• In her survey of 54 major research libraries in
North America, Johnson (1991) discovered that
ease of use by patrons was a major
consideration in the selection of a new system,
ranking above cost and, perhaps surprisingly, ease of
use by staff. She found that these same libraries
considered improvement of user services the
major objective of automation, and improved
service to users the major accomplishment of
automation.
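• After surveying 54 major libraries in North America,
Johnson (1991) found that perceived ease of use by patrons was
the main consideration in selecting a system. She also found
that improving user services was both the main objective and
the main accomplishment of automation.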
• Peters (1988) identifies three types of systems
evaluation: (1) functional – to determine whether
a system’s features meet the library’s needs;
(2) economic – to determine the affordability of a
system; and (3) performance – to reveal whether
the system capacity can meet present or
anticipated future demands.
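• Peters defines three types of systems evaluation:
functional, to determine whether the system's features meet the
library's needs; economic, to determine what the system costs
(in time, money, and so on); and performance, to determine
whether the system can meet present or future demands.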
Criteria                                              Value (%)
                                                      1      2      3
Ease of use by patrons                                77.8   14.8    7.4
Availability of application modules and subsystems    77.8   18.5    7.4
Completeness of modules and subsystems                68.5   22.2    9.3
Cost of system                                        68.5   29.6    1.9
Cost of hardware                                      61.1   33.3    5.6
Need for local programming staff                      59.3   29.6   11.1
Service reputation of vendor (e.g. after-sales)       53.7   37.0    9.3
Ease of use by staff                                  48.1   51.9    0.0
Comparable installed sites (customer references)      44.5   40.7   14.8
Previous experience with vendor                       25.9   29.6   44.5
Training and documentation provided                   22.2   66.7   11.1
Bias against vendor                                    5.6   25.9   68.5
Key:
1 = Seriously considered
2 = Considered to some extent
3 = Not considered at all
• There are obviously many possible ways in which
approaches to the evaluation of automated systems can
be categorized. For the purposes of this chapter, two
major approaches are identified:
• Evaluation without user involvement or with less than full
user involvement (user-free).
• Evaluation with full user involvement (user-involved).
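• There are many ways to evaluate automated systems, but they
fall into two categories: (1) evaluation without user
involvement; (2) evaluation with full user involvement. [3]
• Note 3 (Thinking): The Technology Acceptance Model (TAM)
belongs to the second of these two approaches.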
• User-Free Evaluation: This category of
evaluation focuses on system features
rather than on how these are exploited by
a particular group of users.
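• User-free evaluation: this type of evaluation focuses on
system features rather than on how particular users experience
them. It can be used for system selection, for acceptance
testing, and for decisions about enhancing or replacing a
system.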
• One useful tool that can be used in the
selection of systems is a checklist to
determine the features present in a
particular system or, more particularly, to
compare the characteristics of two or more
systems.
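• This kind of evaluation usually relies on a checklist, either
to show which features a particular system has or to compare
the characteristics of two systems.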
• A point value may be assigned to each feature,
and a differential weighting scheme may be
established to place emphasis on features that
are considered more important than others. In
other cases, features are assigned an equal
rating of 1 or 0. System scores can be derived
from the grids, with subtotals to indicate system
strengths in particular areas, and total scores to
indicate overall performance.
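• In a checklist, features can be given different weights to
emphasize the important ones. System scores are obtained by
summing down each column: subtotals show a system's strength in
a particular area, and the overall total indicates overall
performance.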
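The weighted checklist scoring described above can be sketched in a few lines of Python. This is only an illustrative sketch: the feature names, areas, weights, and the two systems ("System X", "System Y") are hypothetical and do not come from the chapter. Each feature is marked 1 (present) or 0 (absent), multiplied by its weight, and summed into per-area subtotals and an overall total.

```python
from collections import defaultdict


def score_systems(features, marks):
    """Return per-area subtotals and an overall total for each system.

    features: list of (feature_name, area, weight) tuples (hypothetical data)
    marks:    {system_name: {feature_name: 1 if present else 0}}
    """
    results = {}
    for system, present in marks.items():
        subtotals = defaultdict(int)
        for name, area, weight in features:
            # A feature missing from the marks is treated as absent (0).
            subtotals[area] += weight * present.get(name, 0)
        results[system] = {
            "subtotals": dict(subtotals),
            "total": sum(subtotals.values()),
        }
    return results


# Hypothetical checklist: weights express relative importance.
FEATURES = [
    ("keyword search", "Searching", 3),
    ("boolean operators", "Searching", 2),
    ("online help", "User assistance", 2),
    ("logoff instructions", "User assistance", 1),
]
MARKS = {
    "System X": {"keyword search": 1, "boolean operators": 0,
                 "online help": 1, "logoff instructions": 1},
    "System Y": {"keyword search": 1, "boolean operators": 1,
                 "online help": 0, "logoff instructions": 0},
}

scores = score_systems(FEATURES, MARKS)
for system, result in scores.items():
    print(system, result["subtotals"], result["total"])
```

Setting every weight to 1 gives the unweighted variant, in which each feature simply counts as 1 or 0.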
[Table: sample comparative checklist of questions on remote
access to an OPAC. Columns are twelve systems labelled A to L;
the system names recoverable from the header are OBIS, Geac,
NOTIS, PALS, DRA, and HOMe. The per-question marks were lost in
extraction.]
Questions:
1. Is there adequate logon instruction (i.e., an explanation of
which terminal types are supported)?
2. Are the contents and coverage of the OPAC clearly explained?
3. Are the key equivalencies explained for the remote user’s
keyboard?
4. Is there adequate logoff instruction?
5. Is the screen display always clean (i.e., no garbage
characters)?
6. (a) Is remote access unrestricted in terms of time of day?
(b) Does the system tell the user if there is a time limit to
remote sessions?
(c) Does the system give a warning message of automatic logoff
if there is no user input?
7. Does the remote user have access to the same OPAC as those
who use dedicated terminals in the library?
8. Does the system indicate where the remote user can get
additional help?
Score (maximum 10), in extraction order:
6, 5, 4, 7, 5, 5, 6, 6, 6, 8, 8, 7
Note: “NA” means “not applicable”
• The checklist method of evaluation is useful for
several reasons. In the case of a single system
review, it helps one to arrive at a list of desirable
features, and to identify the strengths and
weaknesses of a particular system. In the case
of a multiple system review, a comparative
checklist can help to verify the existence of
features across systems and thus to identify
comparative strengths and weaknesses.
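• A checklist is a very useful tool in systems evaluation. For
a single system, it helps to compile a list of requirements and
to identify the system's strengths and weaknesses. For a
comparison of several systems, it helps to pin down the
differences in features between them and to find their relative
strengths and weaknesses.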
• The use of a checklist ensures that the
same questions about system features are
posed consistently across systems.
• Using a checklist ensures that the systems are compared
against a consistent set of questions.
• Cherry et al. (1994) employed a checklist to survey
features in the OPACs of twelve Canadian academic
libraries. Data on each system were collected twice, by
two different researchers, and the two datasets were
checked a third time against the systems to resolve any
disagreements. One hundred seventy features were
included in the checklist, grouped into ten functional
categories: 1. Database characteristics; 2. Operational
control; 3. Searching; 4. Subject search aids; 5. Access
points; 6. Screen display; 7. Output control; 8. Commands;
9. User assistance; 10. OPAC usability via remote access.
• Cherry et al. (1994) used a checklist to evaluate the OPACs
of twelve Canadian academic libraries. Data on each system were
collected by two different researchers, and the two datasets
were then checked a third time against the systems to resolve
discrepancies. The checklist covered 170 features, grouped into
the ten functional categories listed above.
• Acceptance testing or benchmarking is a
process often used by libraries to verify that the
new or upgraded system meets the contract
requirements. Often the conditions of
acceptance in a contract indicate clearly what
type of performance is expected, and the
acceptable level of performance, to determine
whether a system works in the manner agreed
upon in the contract.
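• Acceptance testing and benchmarking are often used to verify
that a new or upgraded system meets the contract requirements.
The acceptance conditions in the contract state clearly what
performance must be achieved and at what level, and so
determine whether the system works as the contract specifies.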
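An acceptance criterion of the kind described above can be checked mechanically once response times have been measured. The sketch below assumes a hypothetical contract clause of the form "at least a given fraction of transactions must complete within a given number of seconds"; the clause, the threshold values, and the sample measurements are illustrative assumptions, not taken from the chapter.

```python
def meets_response_requirement(response_times, max_seconds, required_fraction):
    """Check a hypothetical contract clause on response time.

    Returns True when at least `required_fraction` of the measured
    response times (in seconds) are no greater than `max_seconds`.
    """
    if not response_times:
        raise ValueError("no measurements supplied")
    within = sum(1 for t in response_times if t <= max_seconds)
    return within / len(response_times) >= required_fraction


# Hypothetical benchmark run: ten measured response times in seconds.
measured = [0.8, 1.2, 1.9, 0.5, 2.4, 1.1, 0.9, 1.5, 1.0, 1.3]

# Nine of the ten measurements are within 2.0 seconds.
passes_95 = meets_response_requirement(measured, 2.0, 0.95)
passes_90 = meets_response_requirement(measured, 2.0, 0.90)
print(passes_95, passes_90)
```

Stating the criterion as a fraction within a limit, rather than as an average, keeps a few very slow transactions from being hidden by many fast ones.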
• At times, public or staff users identify problems or make
suggestions for system changes designed to refine its
operation or its interaction with users. The feedback for
making these changes can come from word of mouth or
from the periodic review of performance logs generated
by the system.
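• Problems and suggestions identified by staff or public users
can lead to corrections or improvements in system functions or
design. This feedback can be gathered by word of mouth, in
writing, or by analyzing system logs.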
• Stress tests are commonly used to test implementations
of new features.
• Stress tests (tests under load or time pressure) are used to
test newly implemented system features.
• Capacity planning is another important element
in overall evaluation. By tracking the size of the
database, and estimating its growth rate,
projections can be made about when to increase
capacity, and whether this increase in size will
degrade or otherwise affect response time and
other performance factors. Precise capacity
planning is difficult because it involves projection
and prediction based on numerous complex
performance factors.
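• Capacity planning is also an important part of overall
evaluation: is the database capacity sufficient for future
growth, and will that growth affect system response time?
Precise planning is quite difficult because it requires
predicting many performance factors and assessing how they
interact.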
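As a rough illustration of the projection step, the sketch below estimates how many months remain before a database reaches its capacity. It assumes constant compound monthly growth, which is a deliberate simplification (the text notes that real capacity planning depends on many interacting factors); the function name and the sample figures are hypothetical.

```python
import math


def months_until_full(current_records, capacity_records, monthly_growth_rate):
    """Estimate months until the database reaches capacity.

    Assumes constant compound growth: size after n months is
    current_records * (1 + monthly_growth_rate) ** n.
    Returns 0 if capacity is already reached, or None if there is no growth.
    """
    if current_records >= capacity_records:
        return 0
    if monthly_growth_rate <= 0:
        return None
    months = (math.log(capacity_records / current_records)
              / math.log(1 + monthly_growth_rate))
    return math.ceil(months)


# Hypothetical figures: 800,000 records now, room for 1,000,000,
# growing by 1% per month.
remaining = months_until_full(800_000, 1_000_000, 0.01)
print(remaining)  # 23
```

The result is only as good as the growth estimate, so it is worth recomputing whenever new size measurements are logged.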
• User-involved Evaluation: For over twenty years,
a growing body of research based on
information science and cognitive psychology
has been performed to gain a better
understanding of how users interact with
systems, and how the results of that interaction
can be evaluated. One practical goal of this type
of research is to collect and analyze information
that can be fed back into better system design.
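• User-involved evaluation: over the past twenty years,
research on human-computer interaction in information science
and cognitive psychology has advanced considerably, to the
point where the results of the interaction can themselves be
evaluated. The goal of this research is to collect and analyze
feedback that leads to better system design.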
• The interaction between the user and the system can be
the subject of study for a number of purposes. The
studies discussed in this chapter are carried out to learn
more about how a system is used and to improve its
performance. Many possible methods are applicable.
Unobtrusive measures gather data while library patrons
are actually using the system. Users may or may not be
aware that their keystrokes, or other actions, are being
recorded or observed. The methods are unobtrusive in
the sense that users are not being asked any questions
and are not required to do anything they would not
otherwise be doing.
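• User-system interaction can be studied for many purposes;
this chapter concentrates on how to improve performance.
Unobtrusive measures collect data while users are actually
using the system, and users may be unaware that their
keystrokes are being recorded and observed. Users are not asked
any questions about their use of the system; the aim is to
assess their activity in a natural setting.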
• Obtrusive measures are used primarily to obtain
feedback on user preferences for various system
features and their opinions on system performance.
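• Obtrusive measures mainly yield feedback on user preferences
and on users' opinions of system performance, for example
through interviews or system tests carried out under a
researcher's supervision.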
• Data thus collected can be useful in revealing how
specific system features are exploited and in identifying
features that appear to be giving users significant
problems. At least three types of approach are applicable:
review of transaction logs, direct observation of users
operating at terminals, and video and/or audio taping of
user performance.
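• Such data can reveal how specific system features are used
and which ones give users significant problems. At least three
approaches are applicable: reviewing transaction logs, directly
observing users at terminals, and video or audio recording of
users as they work with the system.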
• Transaction log analysis (TLA) has been defined as the
“… study of electronically recorded interactions
between online information retrieval systems and the
persons who search for the information found in those
systems” (Peters et al., 1993a).
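• Transaction log analysis is defined as the study of the
electronic records of interaction between information retrieval
systems and the users who search them. In the 1970s it came to
be regarded as a tool for analyzing interaction between users
and online catalogs.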
• Many TLA studies gather information on how frequently
system features are used: choice of search type, use of
help screens, how many hits users are willing to review,
how often a search results in zero hits, the number and
type of error messages that users receive, and so on.
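• Most TLA studies collect data on how often system features
are used: the choice of search type, the use of help screens
and functions, the number of hits users are willing to review,
and the number of error messages users see.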
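The frequency measures listed above are straightforward to compute once a log is available. The sketch below assumes a hypothetical tab-separated log format (session id, command, number of hits, with "-" where hits do not apply); real systems use their own formats, so the field layout and command names here are illustrative only.

```python
from collections import Counter

# Hypothetical log: session id, command, hits ("-" when not a search).
SAMPLE_LOG = (
    "s1\tsearch_author\t12\n"
    "s1\thelp\t-\n"
    "s1\tsearch_title\t0\n"
    "s2\tsearch_subject\t0\n"
    "s2\tsearch_subject\t45\n"
    "s3\tsearch_title\t3\n"
)


def summarize_log(text):
    """Count command usage and the share of searches with zero hits."""
    commands = Counter()
    searches = 0
    zero_hits = 0
    for line in text.splitlines():
        session_id, command, hits = line.split("\t")
        commands[command] += 1
        if command.startswith("search_"):
            searches += 1
            if int(hits) == 0:
                zero_hits += 1
    zero_hit_rate = zero_hits / searches if searches else 0.0
    return {
        "commands": commands,
        "searches": searches,
        "zero_hit_rate": zero_hit_rate,
    }


summary = summarize_log(SAMPLE_LOG)
print(summary["commands"].most_common())
print(summary["searches"], summary["zero_hit_rate"])
```

The same loop can be extended to count error messages or help requests per feature, provided the log records them.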
• An annotated bibliography by Peters et al.(1993b) and a
review article by Simpson (1989) serve as two excellent
sources of further information about TLA.
• Two excellent sources on TLA are the bibliography by Peters
et al. (1993b) and the review by Simpson (1989); both are good
starting points for further TLA research.
• Despite all of its potential benefits, transaction log
analysis does have limitations. In many systems with
transaction log monitoring facilities, it is either difficult or
impossible to delineate individual user searching
sessions.
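• Despite these advantages, TLA has limitations. In many
systems it is difficult, or even impossible, to pick out an
individual user's search session. Another limitation is that
TLA is poorly suited to cross-system comparison, because it
cannot show the same features across systems.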
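When a log carries timestamps but no session markers, one common workaround is to split the activity at long gaps of inactivity. The sketch below illustrates that heuristic; it is not a method described in the chapter, the 300-second timeout is an arbitrary assumption, and the result is only an approximation because two users may follow each other at the same terminal without a pause.

```python
def split_sessions(timestamps, timeout_seconds=300):
    """Group event timestamps (in seconds) into approximate sessions.

    A new session starts whenever the gap since the previous event
    exceeds `timeout_seconds`. The timeout value is an assumption.
    """
    sessions = []
    for t in sorted(timestamps):
        if sessions and t - sessions[-1][-1] <= timeout_seconds:
            sessions[-1].append(t)
        else:
            sessions.append([t])
    return sessions


# Hypothetical event times from one terminal, in seconds.
events = [0, 40, 100, 900, 950, 2000]
sessions = split_sessions(events)
print(sessions)  # [[0, 40, 100], [900, 950], [2000]]
```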
• Another problem is that of cost. A
comprehensive monitoring module can
add a significant overhead to the cost of
operating the system.
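• Another issue is cost: continuous monitoring places a
noticeable burden on system operation. Library staff and
managers may also lack the time to analyze the resulting
information.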
• Transaction log analysis collects data about system use
in the aggregate and deals only with the quantitative:
which commands are used how often, which headings are
consulted, how much time is spent per session, and so
on. The most obvious example is the monitoring and
analysis of use of a help command. Knowing what types
of help are requested by users, especially in the case of
a new system or one that has recently added new
features, can be of great value in identifying problem
areas that may not have been anticipated in the system
design but may, in fact, be rather easy to correct.
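• TLA analyzes system use quantitatively: frequency of use,
time spent online, and requests for help with system features,
gathered as totals and distributions. The most obvious example
is monitoring and analyzing use of the help command. Knowing
which types of help users request most often makes it possible
to pinpoint problems in a new system or in newly added
features: problems that were not anticipated in the system
design but that need to be corrected.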
• Although it is rarely acknowledged, direct observation is
perhaps one of the most commonly employed
techniques for collecting information about online system
users. Critics often suggest that observation is an
unscientific way of gathering only the information needed
to support one’s own views. The technique need not be
flawed; it is the degree of consistency in what is
observed, and at what intervals it is observed, that
determines the reliability of the data collected.
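• Although researchers seldom acknowledge it, direct
observation may be the most widely used technique for
collecting information about users of online systems. Critics
regard it as unscientific, a way of gathering only the
information that supports the researcher's own view. The
technique is not inherently flawed, however: what determines
the reliability of the data is how consistently events are
observed and at what intervals.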
• It is important to employ valid sampling techniques in
conjunction with observation in order to obtain reliable
data on which management decisions can be based. For
example, if one wants to know how many times users
have to wait in line to use terminals in the reference
room, one obviously cannot rely solely on the
observations of a single librarian who only staffs the
reference desk fifteen hours per week, between 8 a.m.
and 5 p.m., Monday through Friday. Direct observation
can be useful, on its own or to supplement other
methods, when appropriate sampling methods are
employed and input is received from more than one
observer.
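• Direct observation must be combined with sound sampling
techniques to yield reliable data for management decisions. For
example, to learn how long users queue for terminals in the
reference room, one librarian's fifteen hours of observation
during weekday working hours is nowhere near enough. Direct
observation is useful only when several observers collect the
data and appropriate sampling methods are used.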
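One way to avoid the bias in the example above is to draw the observation times at random across all opening hours rather than from one person's desk schedule. The sketch below draws a fixed number of one-hour slots for each day of the week; the opening hours, the number of slots, and the function name are hypothetical, and other sampling designs would be equally valid.

```python
import random


def sample_observation_slots(days, opening_hour, closing_hour,
                             slots_per_day, seed=None):
    """Pick random one-hour observation slots for each day.

    Hours are start times on a 24-hour clock, drawn without replacement
    from opening_hour up to (but not including) closing_hour.
    """
    rng = random.Random(seed)
    hours = list(range(opening_hour, closing_hour))
    schedule = {}
    for day in days:
        schedule[day] = sorted(rng.sample(hours, slots_per_day))
    return schedule


# Hypothetical library open 8:00 to 22:00, seven days a week.
DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
schedule = sample_observation_slots(DAYS, 8, 22, slots_per_day=3, seed=1)
for day, slots in schedule.items():
    print(day, slots)
```

Sampling every day separately guarantees that evenings and weekends are represented, and the resulting slots can then be divided among several observers.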
• A number of studies have analyzed the results of video
and/or audio taping of the speech and actions of users
during search sessions. The technique has been used to
examine whether the cognitive, affective, or attitudinal
behavior of users affects their performance and the
outcomes of searching. Methods like protocol analysis,
which employ a pre-determined framework for analyzing
user comments (asking a user to “think-aloud” while
searching, then recording the resulting behavior and
comments), can be used to classify and evaluate the
relative effect of user decision-making and behavior on
the success or failure of searching.
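• Some studies use video or audio recording to capture and
analyze what users do and say during search sessions. The
technique has been used to assess whether users' cognition or
attitudes affect the outcome and efficiency of their searches.
Methods such as protocol analysis (asking users to think aloud
while searching, and recording their behavior and comments) can
then be used to evaluate users' decision-making and the results
of their search behavior.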
• Survey questionnaires enable the collection of data
about user satisfaction with a system or specific aspects
of it, searching preferences and attitudes, demographic
data about users, and the level of skills or knowledge
that users possess.
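• Questionnaires can also collect data on user satisfaction
with the system and on users' views, such as search preferences
and attitudes, demographic data, and users' level of
information literacy.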
• The published literature includes several studies in which
online questionnaires were applied to record user
attitudes and preferences for system features. The
advantage of online questionnaires is the ability to collect
critical incident data about a session immediately following
the session.
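• Online questionnaires can record users' attitudes toward and
preferences for system features, and can collect critical
incident data right after a search session.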
• In comparison with questionnaires and transaction log
analysis, interviews can provide a more intimate view of
the user’s perspective on the system under examination.
• Compared with questionnaires and transaction log analysis,
interviews give a more detailed and in-depth view of how users
see the system.
• Another type of interview – the focus group interview –
may be conducted with a small group and one or more
interviewers, with video and/or audio taping, and/or
assistants transcribing notes and statements during the
group discussion. Focus group interviews are regularly
used in marketing research to gather information from a
particular, pre-selected group of users about products or
potential products. [1]
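• Another kind of interview is the focus group interview, held
with one or more small groups and recorded on audio or video or
in discussion notes. Focus group interviews are commonly used
in market research to gather a particular group's views on
products and potential products.
• Note 1: Group opinions can also be collected through online
group chat, for research on topics related to virtual
communities.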
• Some online systems offer the option for users to send
unsolicited comments to librarians or system designers.
In some cases, system administrators post an e-mail
address to encourage users to report bugs or anomalies
they encounter while searching the system. Therefore,
the comment option is a valuable tool in problem
identification. Current systems commonly offer an option
that enables users to send mail messages from within the
system to system administrators or other staff who work
closely with various aspects of the catalog.
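• Some online systems let users send unsolicited comments to
librarians or system designers. In some cases, system
administrators post an e-mail address to encourage users to
report bugs they encounter while searching. This feedback helps
to identify problems. Most current systems offer a comment
function that lets users send messages to system administrators
or library staff.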
• Limitations of Evaluation
• Although the reasons for performance evaluation are
compelling, much of the work done in systems
evaluation within libraries is the exception rather than the
rule. This situation can be commonly observed for a
number of reasons. Sometimes system resource usage
is neither the concern nor the domain of the library, but
rather of the campus or city administrative computing
center. In situations where the management of the
computing facilities is separated from the management
of the library, it is more difficult to establish a cohesive
picture of the factors that affect the system’s
performance, much less to correct these situations when
performance problems arise.
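• Although there are compelling reasons for performance
evaluation, systems evaluation is not routine work in most
libraries. Several reasons can be observed. System resource
usage is sometimes neither the concern nor the domain of the
library, but rather of the campus or city administrative
computing center. When the computing facilities are managed
separately from the library, it is hard to build a coherent
picture of the factors that affect system performance, let
alone to correct problems when they arise.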
• Also, online systems typically generate hundreds
of statistical reports about the system functions
on a regular (usually monthly) basis. Often,
information about system resources usage
needs to be carefully analyzed and translated to
a different format in order to be usable for library
managers. System vendors have moved
increasingly to report generation modules that can
be customized by libraries in order to avoid this
pitfall.
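• Online systems can generate hundreds of statistical reports
on system functions (weekly, monthly, or quarterly). The
resource usage information in these reports must be carefully
analyzed and converted before it is useful to library managers.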
• The analysis of system performance data
requires both skill and a commitment to ongoing
analysis. Not all librarians feel they have
adequate training in the use of quantitative or
qualitative analysis methods. Further, it is not
always clear where in the library organization
the responsibility ought to rest for ongoing
evaluation, beyond the annual budgeting
process for equipment, software, and online
contractual services.
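• Analyzing system performance takes skill and sustained
commitment, yet not all librarians feel adequately trained in
quantitative and qualitative analysis. It is also not always
clear where in the library organization responsibility for
ongoing evaluation should rest, beyond the annual budgeting for
equipment, software, and contracted services.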
• The published literature reveals that this
analysis is now being performed by many
different people – system, reference,
collection development, technical services,
and administrative librarians.
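• The published literature shows that this kind of evaluation
is now carried out by many different people: systems,
reference, collection development, technical services, and
administrative librarians.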