Pathfinder Quick Guide

Table of Contents
Pathfinder Quick Guide
Getting Started
Advice on Selecting Terms
How to Collect Proximity Data
Pathfinder Control Center
Sample Data Sets
Data Coherence
Data Correlations
Setting Network Generation Parameters
Comparing Networks
Network Layouts (Pictures)
Getting Started
The help available in the Pathfinder software is actually a Microsoft Word document (Pathfinder.doc)
that can be found in the directory, C:\pfdir, after the software has been installed. You may find it
useful to print the Pathfinder.doc document to facilitate ready access.
Before you can use the Pathfinder software, you must have one or more proximity data files that
conform to the format requirements outlined in detail in the Help. Data files are plain text (txt) files
that contain a header followed by the data. Here is a small example of such a file:
data
similarity
5 nodes
0 decimal places
10 minimum value
90 maximum value
lower triangular matrix
32
40 49
32 38 53
73 63 77 18
Properly formatted data files are created by the data collection tools distributed with the Pathfinder
software. The Rate program and the Target program provide ways of obtaining judgments of
relatedness from participants. The data files produced by these programs can be read by the Pathfinder
software. Of course, data files may come from other sources in which case the files must contain an
appropriate header and properly formatted data. Again, see the Help document for details.
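As an illustration only, a file laid out like the sample above could be read with a few lines of Python. This sketch assumes exactly the header shown in the example (type line, node count, decimal places, minimum, maximum, matrix shape). The complete format is defined in the Help document, and the function name parse_proximity_text is hypothetical, not part of the Pathfinder software.

```python
SAMPLE = """data
similarity
5 nodes
0 decimal places
10 minimum value
90 maximum value
lower triangular matrix
32
40 49
32 38 53
73 63 77 18
"""

def parse_proximity_text(text):
    """Parse a lower-triangular proximity file laid out like the sample.

    Hypothetical helper: it handles only the header shown in the example.
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if lines[0].lower() != "data":
        raise ValueError("expected the file to start with 'data'")
    kind = lines[1].lower()                 # e.g. "similarity"
    n = int(lines[2].split()[0])            # "5 nodes"
    minimum = float(lines[4].split()[0])    # "10 minimum value"
    maximum = float(lines[5].split()[0])    # "90 maximum value"
    if "lower triangular" not in lines[6].lower():
        raise ValueError("this sketch reads only lower triangular matrices")
    values = [float(tok) for ln in lines[7:] for tok in ln.split()]
    if len(values) != n * (n - 1) // 2:
        raise ValueError("wrong number of proximities for the node count")
    # Fill a symmetric n x n matrix; the diagonal is left at 0.0 and unused.
    matrix = [[0.0] * n for _ in range(n)]
    it = iter(values)
    for i in range(1, n):
        for j in range(i):
            matrix[i][j] = matrix[j][i] = next(it)
    return {"kind": kind, "n": n, "min": minimum, "max": maximum, "matrix": matrix}

parsed = parse_proximity_text(SAMPLE)
print(parsed["n"], parsed["matrix"][4][2])   # node count, proximity of nodes 5 and 3
```

For the sample file this prints 5 and 77.0, the value in the third column of the last row.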
The labels for nodes in Pathfinder networks come from files containing terms. The term files are also
plain text (txt) files with one term (word or phrase) per line. There must be as many lines as there are
nodes in the data set(s). If only one set of terms is used, the file should be named terms.txt. If
different sets of data have different terms, the name of the terms file should match the first part of the
name of the data file. For example, suppose we have a data file named flight.prx.txt. The terms
corresponding to the data in the file could be in a file named terms.txt or a file named flight.trm.txt.
The software will first look for a file with the more specific name and then for a file with the more
general name. If neither file is found, the nodes will be assigned numbers from 1 to n where n is the
number of nodes.
A simple example of the contents of a terms file with 5 terms that might be associated with the sample
data file above is:
Traffic jam
Rush hour
HOV lanes
Accidents
Talking on cell phone
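The lookup order for term files described above can be sketched in Python as follows. This is not the software's own code. The function name load_terms is hypothetical, and the sketch assumes that the "first part" of a data file name is everything before the first dot and that the terms file sits in the same folder as the data file.

```python
from pathlib import Path

def load_terms(data_file, n_nodes):
    """Return node labels for a data file, trying the specific name first.

    Assumptions: flight.prx.txt -> flight.trm.txt, then terms.txt, both in
    the data file's folder; otherwise the labels are "1" .. "n".
    """
    data_path = Path(data_file)
    stem = data_path.name.split(".")[0]
    candidates = [data_path.with_name(stem + ".trm.txt"),
                  data_path.with_name("terms.txt")]
    for candidate in candidates:
        if candidate.is_file():
            terms = [ln.strip() for ln in candidate.read_text().splitlines()
                     if ln.strip()]
            if len(terms) != n_nodes:
                raise ValueError(
                    f"{candidate.name}: expected {n_nodes} terms, found {len(terms)}")
            return terms
    # Neither file exists: number the nodes from 1 to n.
    return [str(i) for i in range(1, n_nodes + 1)]
```

Calling load_terms("flight.prx.txt", 5) would look for flight.trm.txt, then terms.txt, and fall back to the labels 1 through 5.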
Advice on Selecting Terms
Subject Matter Experts (SMEs) are needed to generate the terms. It is most important to get at least
one SME who (a) really understands the purpose of the research, (b) is articulate, and (c) is committed
to the project. Having two or three such people is helpful because they can discuss potential terms and
reach consensus.
There are at least two purposes behind modeling knowledge: (1) to model what is important about a
domain, or (2) to discriminate levels of expertise in the domain.
To model the domain in general, we should attempt to identify (a) the concepts (things, words, terms,
ideas, strategies) that are central to understanding the domain, and (b) the relations between concepts that
are central to the domain. The terms involved in these critical relations should be included.
To develop a system for discriminating levels of expertise, including changes that occur with
experience, the focus should be on the differences between novices and experts. That is, try to identify
the concepts and relationships between concepts that experts should understand that novices would not.
Because Pathfinder networks are focused on identifying relationships between concepts, it is critical to
find pairs of concepts that reflect differences in expertise and experience. Some concepts may appear
to be related to novices, but not to experts. Other concepts may come to be seen as related only as a
function of gaining experience in a domain. These are the concepts to include in the list of terms.
Some ways of interacting with experts include (a) interviewing experts about a syllabus or
curriculum, (b) asking what concepts change dramatically as a novice gains experience, (c) asking
what relationships among concepts change dramatically as a novice becomes expert, and (d) asking what
concepts and relationships experts know much more about than novices.
It is recommended that at least 20 terms be used in Pathfinder studies. Experience suggests that more
is better, but there are practical limitations to what can be expected of participants in providing
proximity judgments.
How to Collect Proximity Data
Pathfinder networks can be generated from proximity data of many kinds. Basically, any measure of
the relations between concepts can be used. For example, studies have been done using counts of the
co-occurrence of terms in paragraphs of text, the frequency of citations between authors, correlations
among various measures, etc. Often proximity data come from judgments of participants about the
relatedness of concepts. One method presents all pairs of the concepts to participants and records
judgments about how related the terms in each pair are. With n terms, this method provides n*(n-1)/2
judgments (all pairs) for each participant. This grows with the square of the number of terms, which
makes it impractical for more than 20 to 30 terms. Another method requires participants to move terms
onto a target display to indicate how related they are to the term presented in the middle. By
presenting n displays, with each term in the middle once, an estimate of the relatedness of all pairs can
be generated. Both the pairwise rating and the target method are available with the Pathfinder
software. We expected the target method to be faster for participants to use, but experience has proven
that to be incorrect. Experts do tend to prefer the target method, however.
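To make the growth in rating effort concrete, the short Python sketch below evaluates n*(n-1)/2 for a few list sizes. The function name pair_count is only illustrative.

```python
def pair_count(n_terms):
    """Number of distinct pairs among n_terms terms: n*(n-1)/2."""
    return n_terms * (n_terms - 1) // 2

for n in (10, 20, 30, 40):
    print(n, "terms ->", pair_count(n), "pairwise judgments")
```

This gives 45, 190, 435, and 780 judgments respectively, whereas the target method always needs n displays.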
Pathfinder Control Center
The main screen of the Pathfinder graphical user interface is shown on the preceding page. The screen
depicts the state of the interface after some data have been read in and some networks generated.
The Pathfinder Control Center allows you to initiate the functions accomplished by the software.
The Project Base Folder (PBF) is where the data you analyze and the results you create are stored.
The "Select a task to perform" area contains buttons that accomplish critical computations. Each button will
launch a new window to assist you in the task. The buttons are organized in the sequence one usually
follows in working with the software. First, you must Add Data to the project. Then you may want to
Average Data sets to create new, average, data sets. Once data are in the project, you can Derive
Networks of various types. The software can generate nearest neighbor networks and threshold
networks in addition to Pathfinder networks. You can Compare Networks to assess their similarity.
You can export analyses using Save Analyses in Excel File or Save Analyses in Text File.
The export files will contain information about each data set including a measure of the coherence of
the symmetrical data sets. In addition, correlations of the data sets are included in the export files.
Finally, the export files contain the results of the network similarity comparisons you have run.
KNOT Network and Layout Files provides for reading and writing files that are compatible with the
earlier versions of the KNOT software for PCs and Macs. These functions may not be needed for
projects conducted solely with the new software. Read KNOT Network and Layout Files will allow
you to select layout (.lo) or pfnet (.pf) files created by earlier software and bring them into the project.
Write KNOT Network File will create an old format network or layout file from a network currently in
the project. If you want access to the individual link weights, you can get them from this file.
Help opens the Pathfinder.doc document in Microsoft Word.
Files in Project Folder shows you some files in the PBF. You can limit which files are shown by
setting the filter in the panel. Click on a file name to open that file in a MS Windows application.
Nodes is a list of node labels for a selected network taken from a file.
Project Contents shows how many data sets, networks, and similarity comparisons are in the
workspace at the moment. In a new project, this will be empty. Click on an item to show summary
information about it. Clear Project will erase all of the information stored as a result of analyzing
data for a project, allowing for a fresh start. This will not affect your data files. Only the results of
using the Pathfinder program will be cleared. You can begin again by Adding Data.
Networks shows a list of networks that have been generated in the project. When a network is selected,
the nodes for that network are shown in the Nodes panel. Double clicking on a network will produce a
picture of the network. Selecting a network and a node will produce a picture of the part of the
network focused on that node. All nodes within the Links from Focus Node number of links will be
included in the focus picture. If multiple networks are selected, then Draw Net will produce a network
which merges the links from all selected networks. Delete Selected Nets will remove the selected
networks from the project. The associated proximity data are not affected.
Refresh updates all of the lists on the screen.
Network pictures contains options to assist you in generating pictures of networks. Select networks in
the Networks list window. Selecting one network will result in a picture of that network. Selecting
more than one network will result in the creation of a new network which merges the links from all of
the selected networks. Obviously networks must be compatible (have the same number of nodes) to be
merged.
You can select the alignment of the label with respect to the node location, the color of a box, if any,
and whether to provide a 2 or 3 dimensional display. Because you can rotate the pictures, 3D can often
be useful. However, 3D is not available for directed networks.
Layout Method provides various options for determining the positions of the nodes in the picture.
Select “Previous layout (if available)” if you want to recall the positions from an earlier display of the
same network. If the positions are not available, it will compute new ones. Select “New network
layout” if you want to discard the old positions and generate new ones in a network format. Select
“New tree layout” if you want a tree-style layout. Select “Retain node positions” if you want the
nodes to be in the same position as the last displayed picture. This can facilitate comparing different
networks by keeping the nodes in the same position. To compare two networks, first generate and lay out
a merged network from the two, then select “Retain node positions” while
drawing each of the two networks in turn. You can then click alternately on the toolbar icons of the
two nets and see the links change as you switch. Select "Data Coordinates (if available)" to use
coordinates associated with the proximity data used to generate the network. Such coordinates are
available if the data are "coordinates," "features," or "attributes." See the section on Proximity Data
File Formats for details.
Add Link Strength Slider adds a slider to the drawing of a network. The slider allows you to
selectively remove the weakest links in a network in steps until all of the links are removed.
Links from Focus Node specifies which nodes to include in a focus picture on a single node. All nodes
within the specified number of links of the focus node will be included.
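Selecting the nodes for a focus picture amounts to a breadth-first search that stops at a fixed depth. The Python sketch below illustrates the idea on a small made-up network. The function name nodes_within is hypothetical, and links are treated as undirected here, which is an assumption about how the software counts links from the focus node.

```python
from collections import deque

def nodes_within(links, focus, max_links):
    """Return the nodes reachable from `focus` in at most `max_links` links."""
    neighbors = {}
    for a, b in links:                      # build an undirected adjacency map
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    distance = {focus: 0}
    queue = deque([focus])
    while queue:
        node = queue.popleft()
        if distance[node] == max_links:     # do not expand past the limit
            continue
        for nxt in neighbors.get(node, ()):
            if nxt not in distance:
                distance[nxt] = distance[node] + 1
                queue.append(nxt)
    return sorted(distance)

links = [(1, 5), (2, 5), (3, 5), (3, 4)]    # illustrative network, nodes 1..5
print(nodes_within(links, focus=4, max_links=1))   # [3, 4]
print(nodes_within(links, focus=4, max_links=2))   # [3, 4, 5]
```

Raising the limit to 3 brings in every node of this small network.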
Arrowhead Size Factor allows you to control the size of arrowheads in layouts of directed networks –
values less than one will decrease the size, values greater than one will increase the size.
Draw Net simply initiates the network drawing. It must be used when more than one network is
selected. It may be used with only one network, or a drawing of one network can be initiated by
double clicking on the network name in the list of networks.
Click Close All Network Drawings to close all of the picture windows at once. The current node
positions will be saved.
Loops (links from a node to itself) are shown with a border around the name of the node.
Sample Data Sets
To get the feel of using the software, try running an analysis of the sample data files. The files, psy.prx
and bio.prx, are included for you to experiment with.
Data Coherence
When data are read into the project, a measure of coherence is computed for the data set. The
coherence measure reflects the consistency of the data. The coherence of a set of proximity data is
based on the assumption that relatedness between a pair of items can be predicted by the relations of
the items to other items in the set. First, for each pair of items, a measure of relatedness (the indirect
measure) is determined by correlating the proximities between the items and all other items. Then,
coherence is computed by correlating the original proximity data with the indirect measures. The
higher this correlation, the more consistent are the original proximities with the relatedness inferred
from the indirect relationships of the items. The coherence measure often correlates with expertise (or
degree of learning). Very low coherence values (less than 0.20 or so) may indicate that raters did not
(or could not) take the rating task seriously. Very low coherence may also indicate an error in entering the
proximity data that scrambled it in some way. For example, if the proximities are in the wrong
order for the format specified, low coherence is likely to result.
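The two-step description of coherence can be sketched in Python as below. This is a reading of the description above, not the software's actual code, and details such as how ties or constant rows are handled are unknown. The sketch assumes a full symmetric matrix and that, for a pair of items, "all other items" excludes the two items themselves. The names pearson and coherence are illustrative.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)    # fails if either list is constant

def coherence(matrix):
    """Correlate direct proximities with indirect (profile-based) relatedness."""
    n = len(matrix)
    direct, indirect = [], []
    for i in range(n):
        for j in range(i):
            others = [k for k in range(n) if k not in (i, j)]
            direct.append(matrix[i][j])
            # Indirect measure: how similarly i and j relate to everything else.
            indirect.append(pearson([matrix[i][k] for k in others],
                                    [matrix[j][k] for k in others]))
    return pearson(direct, indirect)

# The 5-node sample data from Getting Started, expanded to a full matrix.
sample = [[0, 32, 40, 32, 73],
          [32, 0, 49, 38, 63],
          [40, 49, 0, 53, 77],
          [32, 38, 53, 0, 18],
          [73, 63, 77, 18, 0]]
print(round(coherence(sample), 3))
```

With only five nodes each indirect correlation rests on three values, so the result for this sample is not a meaningful coherence estimate; the example only shows the mechanics.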
Data Correlations
The data correlations are simply the Pearson product-moment correlations of the data sets. These
correlations provide a measure of the similarity of data sets. These correlations are computed when
data are saved either in Excel files or in text files.
Setting Network Generation Parameters
Pathfinder Networks require two parameters, r and q. The r parameter determines how the lengths of
paths are computed. Values of r other than infinity (∞) require data to be on a ratio scale of
measurement. It is generally safe to use r = ∞. The value of q can be varied to produce variations in
the number of links in the networks. Setting q = n-1 (where n is the number of nodes) will result in the
fewest links. Decreasing q will generally increase the number of links in the networks. We often use
q = 2 when computing the network from average data. This adds links that facilitate comparing
networks based on data from individuals. The individual networks often have many links due to ties in
the rating data.
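For readers who want to see the roles of r and q, here is a minimal Python sketch of link selection for r = ∞ only. With r = ∞ the length of a path is the largest weight on it, and a link is kept when no path of at most q links is shorter. The sketch works on distances, so it converts the sample similarities using distance = minimum + maximum − similarity. That conversion, and the names to_symmetric and pathfinder_links, are assumptions for illustration rather than the software's documented behavior.

```python
def to_symmetric(rows):
    """Expand lower-triangular rows into a full symmetric matrix (zero diagonal)."""
    n = len(rows) + 1
    full = [[0] * n for _ in range(n)]
    for i, row in enumerate(rows, start=1):
        for j, value in enumerate(row):
            full[i][j] = full[j][i] = value
    return full

def pathfinder_links(dist, q):
    """Links of a Pathfinder network with r = infinity and the given q.

    Nodes are reported with numbers starting at 1.
    """
    n = len(dist)
    best = [row[:] for row in dist]     # shortest paths using at most 1 link
    for _ in range(q - 1):              # each pass allows one more link
        best = [[min(max(best[i][m], dist[m][j]) for m in range(n))
                 for j in range(n)] for i in range(n)]
    return [(j + 1, i + 1) for i in range(n) for j in range(i)
            if dist[i][j] <= best[i][j]]

similarity_rows = [[32], [40, 49], [32, 38, 53], [73, 63, 77, 18]]
# Assumed conversion to distances: minimum (10) + maximum (90) - similarity.
distance_rows = [[10 + 90 - s for s in row] for row in similarity_rows]
dist = to_symmetric(distance_rows)
print(pathfinder_links(dist, q=4))      # [(3, 4), (1, 5), (2, 5), (3, 5)]
```

For this small sample, q = 2 happens to give the same four links as q = n-1 = 4; larger data sets typically gain links as q decreases.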
Nearest neighbor networks identify the nearest neighbor(s) of each node with arrows pointing from the
node to its nearest neighbor(s). You can set the number of nearest neighbors for each node. These
networks are not necessarily connected. You can produce connected networks by selecting that option.
In that case, the various disconnected clusters of nodes are connected by the best (most similar or least
dissimilar) links.
Threshold networks simply show the best (most similar or least dissimilar) links. You can set a target
number of links and the software will find at least that many. There may be more in case there are ties
in the data.
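Both simpler network types can be sketched in a few lines of Python. The sketch below works directly on similarities, keeps ties at the cutoff (which is why more links than requested can appear), and leaves out the option to connect disconnected clusters. The function names are hypothetical and the tie handling is an assumption about the software's behavior.

```python
def to_symmetric(rows):
    """Expand lower-triangular rows into a full symmetric matrix (zero diagonal)."""
    n = len(rows) + 1
    full = [[0] * n for _ in range(n)]
    for i, row in enumerate(rows, start=1):
        for j, value in enumerate(row):
            full[i][j] = full[j][i] = value
    return full

def nearest_neighbor_links(sim, k=1):
    """Directed links (node, neighbor) to each node's k most similar nodes."""
    n = len(sim)
    links = []
    for i in range(n):
        others = sorted((j for j in range(n) if j != i), key=lambda j: -sim[i][j])
        cutoff = sim[i][others[k - 1]]
        links += [(i + 1, j + 1) for j in others if sim[i][j] >= cutoff]  # keep ties
    return links

def threshold_links(sim, target):
    """The `target` most similar pairs, plus any pairs tied with the last one."""
    n = len(sim)
    pairs = sorted(((sim[i][j], j + 1, i + 1)
                    for i in range(n) for j in range(i)), reverse=True)
    cutoff = pairs[target - 1][0]
    return [(a, b) for s, a, b in pairs if s >= cutoff]

sim = to_symmetric([[32], [40, 49], [32, 38, 53], [73, 63, 77, 18]])
print(nearest_neighbor_links(sim))   # [(1, 5), (2, 5), (3, 5), (4, 3), (5, 3)]
print(threshold_links(sim, 3))       # [(3, 5), (1, 5), (2, 5)]
print(len(threshold_links(sim, 8)))  # 9, because two pairs tie at 32
```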
Comparing Networks
If you select multiple networks in the network pane, you will see a network with all of the links in any
of the selected networks. If two networks are selected, you will see which links are in each of the
networks and which are in both networks. When more networks are selected, the links are identified
by how many networks included the link. Here is an example with two networks:
There are also statistical measures of similarity that can be computed using the Compare Networks
button on the Pathfinder Control Center.
Network Layouts (Pictures)
Here is a picture of a displayed network. The title shows the name of the network and the q and r
parameters used to generate it. In this case the network is displayed with no box around the label and
the labels are positioned to the right of the node position. The tools allow you to move it, resize it, and
rotate it. The changes you make to its appearance are saved so when you view it again with “Previous
layout (if available),” it will show up just as you left it. Rotations with 3-D layouts are especially
impressive. The File menu has the usual options for printing and saving. You can save the figure
using various formats to enable you to get the image into other applications. It is also helpful to use
screen capture software (e.g., SnagIt) to grab and manipulate the image.
You can click on a node and drag it to move it to a new position (links follow). When you move the
nodes to new positions, the new node positions are saved so future layouts of the same network will
appear as you left it provided you select Previous layout (if available) in the Layout Method options.
If you double click on a node, a network focused on the clicked node will appear.
As you move the slider on the left-hand side, links will drop out of the network in the inverse order of
strength. Weaker links will disappear first. At the top, no links will be present. As the slider is moved
down, the strongest links will appear in order of strength.