
Windows® File Association
Table of Contents
What's in a name?
File Classification
User Interfaces
Methods of Control
File Association - How the Data-oriented Approach to Computer Control Works
Weaknesses of the File Association System
How to tell what your computer knows
How to get your computer to tell the whole truth
Of all the concepts that one should learn in order to easily understand and use the Windows®
operating system, the most pivotal is undoubtedly the concept of file association. Oddly, this
topic is seldom directly addressed in computer textbooks or tutorials. Rather it is slowly learned
indirectly as the result of gradual recognition of the manner in which our computers seem to
respond to our actions. And yet file association directly affects how we must execute even the
most basic commands on Windows-based computers and why one computer often reacts
differently to a command or user action than another computer. The purpose of this document is
to explain in basic terms what file association is and how you can make use of it to control your
computer.
Like almost all topics in computers, this one is fundamentally based on the concept of data
languages. Readers are advised to review the web page about computer data types and languages
prior to reading this document to develop a literacy regarding terms such as "text", "binary", and
"extension".
What's in a name?
Most computer users know that they are expected to assign a label to every file of data that they
store regardless of which program is doing the saving. We can deduce from this that the process
of naming files is performed by the operating system (Windows in our case) rather than by the
many individual applications programs (such as word processors or accounting systems). Every
operating system defines rules that must be followed when selecting filenames. Some operating
systems use filenames simply as unique labels to allow easy retrieval of the data later. But other
systems place far more importance on these names than just using them as simple labels.
Windows allows users to assign a suffix known as an "extension" to the end of a file label
following a period to be used as an indicator of the data type (computer language) contained
within the file. File extensions are examined by Windows and applications programs to
determine what actions are appropriate with each file based on its data type. For example, the
extension ".exe" (short for executable) is often used to indicate that a file contains machine
language (program instructions). Such files cannot be displayed on a computer screen without
prior translation because they do not contain the "text" type of data that screens can decipher.
Their language is understood only by processors. As such, the most appropriate action to perform
upon these files is to send them to the processor for execution. Windows was written to
recognize this fact. Thus when a user sees a file named "game.exe" listed in a folder, the user can
execute that program by typing simply game, rather than having to include the extension and type
game.exe (which is also allowed).
It is important to realize and remember that file extensions are used only as indicators of a file's
data type, but not as guarantees. In other words, a file's extension is not what defines its data
type. Files can be improperly labeled with the wrong extension. For example, simply recording
the extension ".exe" at the end of a filename does not make the file executable. For a file to be
executable, it must be recorded by a programming language translator that knows the machine
language of the type of processor on which it will be run. The definition of a file's data type is a
function of the language abilities of the program that writes it and actually has no dependence on
the extension used to label the file.
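The difference between a file's label and its contents can be sketched in a few lines of Python. This is a minimal illustration, not how Windows itself decides anything. It relies on one fact from outside this document: Windows executable files begin with the two bytes "MZ". The function names are hypothetical, and a signature check like this is only a hint about the contents, not proof.

```python
def claims_to_be_executable(filename: str) -> bool:
    """True if the label (the extension) says the file is a program."""
    return filename.lower().endswith(".exe")


def looks_like_executable(contents: bytes) -> bool:
    """True if the data itself starts with the 'MZ' signature of Windows executables."""
    return contents[:2] == b"MZ"


# A text file mislabeled with ".exe": the label and the contents disagree.
name = "game.exe"
data = b"This is plain text, not machine language."
print(claims_to_be_executable(name))  # the label says "program"
print(looks_like_executable(data))    # the contents say otherwise
```

Renaming the file changes only what the first function reports. The second function looks at what was actually written into the file, which is what determines the data type.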
Randy Gibson
The computer industry has standardized (somewhat) upon common usage of extensions to
indicate specific data types, but there are many variations, dualities, and conflicts as well. Here
are some basic facts about extensions to help you develop a sense of how they are used in
Windows.
• In current versions of Windows, a filename may contain more than one period, and all the
  characters following the rightmost period are interpreted as being the extension.
• Earlier versions of Windows (and its predecessor DOS) restricted extensions to a maximum of
  3 characters, but that limit no longer exists.
• Although Windows can be configured to remember your use of uppercase and lowercase letters
  when naming a file, it ignores the difference when matching filenames. Windows is
  "case-aware", but it is not "case-sensitive".
• Your choice of an extension is arbitrary, although some extension usage has become
  quite common (such as using ".txt" to indicate a "text file"). Also, many software companies have
  adopted specific extensions to act as "signatures" for their files (such as ".xls" to indicate a
  "Microsoft® Excel Spreadsheet").
• You are not required to follow common naming conventions when assigning a filename extension,
  but failure to do so may affect how programs treat your file. Thus, if you choose to label a program
  file with an extension of ".txt", Windows will attempt to treat it like a text file in many situations,
  even though it is not a text file.
• A filename is not required to have an extension, but failure to use one will limit a user's ability to
  easily use a file. For example, if you save a text file and name it simply "myfile" rather than
  "myfile.txt", you will not be able to load that text file by just double-clicking on its filename.
  Instead, you will have to first load a program that knows how to read text (known as being
  "text literate") and then use the filing features within that program to open the file.
• Many programs assign extensions automatically for the user when saving a file, but the user can
  override the program's actions by simply enclosing the entire filename (including the extension)
  inside quotes when typing the filename.
• Windows® is often configured to hide some extensions from view when displaying filenames. This
  was an unfortunate design choice by the authors, because the behavior appears inconsistent
  to anyone without an intimate understanding of how Windows makes its decisions about file
  association. (Remember "file association"? We will get to that eventually, I promise.)
• The icon (small graphic image) that is often displayed near a filename in listings such as folders and
  menus is normally dependent on which extension was used when the file was named.
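The "rightmost period" rule and the case-insensitive comparison described above can be sketched as follows. The function names are hypothetical and the sketch ignores every other Windows naming rule. It only shows how an extension is picked out of a name.

```python
def extension_of(filename: str) -> str:
    """Return the characters after the rightmost period, or '' if there is no period."""
    _, dot, ext = filename.rpartition(".")
    return ext if dot else ""


def same_extension(a: str, b: str) -> bool:
    """Compare two extensions without regard to uppercase or lowercase."""
    return extension_of(a).lower() == extension_of(b).lower()


print(extension_of("report.final.txt"))            # txt
print(repr(extension_of("myfile")))                # ''
print(same_extension("Budget.XLS", "notes.xls"))   # True
```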
File Classification
Operating systems such as Windows treat all stored files as belonging to one of two major
categories:
• Program Files - which contain instructions (in a variety of different programming languages) for the
  computer to perform. Each program file has a name which can be typed by the user as a keyword
  which will load (copy) the instructions from within the program file into the computer's main
  memory and then execute them. When a user types a command, the name of the program file
  serves as the verb.
• Data Files - which contain the data that programs manipulate. The names of these files are used as
  objects when typing commands. Data files contain the knowledge that is important to us, but they
  are of little use without program files to manipulate them.
User Interfaces
An essential component of every PC operating system is a program (or set of programs) called
the user interface. This software addresses a fundamental limitation of today's computers - they
don't understand human languages such as English. Computers can manipulate many languages,
but they do not understand them. In fact, the only language that is native to a computer is the
binary machine language of its processor (which is not intelligible to humans). A user interface is
the part of the operating software that allows people and computers to interact with each other.
User interfaces translate our actions (such as typing commands or clicking on icons) into
machine instructions recognizable by the processor in our computer.
User interfaces come in a few basic types:
• Command-line Interfaces - These require users to type lines of text-based commands on a
  keyboard. When using them, users must follow strict rules of syntax in the same
  way that English users are expected to place the subject of a sentence in front of the verb.
  Before you can use this type of interface, you must learn its rules of syntax. Windows
  offers this method of control to its users through the Start Menu via the choice Run, which
  activates a small dialog box in which the user can type a command line. If you wanted to edit
  a picture file named "mypic.bmp" using the Microsoft® Paint program, you could type the
  name of the program file "mspaint.exe" first, followed by the name of the data file
  "mypic.bmp" that the program is expected to manipulate. Thus, the complete command line
  would be
      mspaint.exe mypic.bmp
  The command-line interface would translate that line of text into machine language
  instructions and submit them to the processor for execution.
• Menu-driven Interfaces - Many programs are now written using these. We interact with them by
  simply viewing a menu of choices on the screen and pressing a letter or numeral to indicate the
  command which we would like to use. The act of pressing the key is translated by the menu-driven
  interface software into machine language instructions for the processor.
• Graphical User Interfaces - Presently, the most popular user interface is the Graphical User
  Interface (or GUI - pronounced "goo'ee"), which involves the display of small pictures known as
  icons that serve as graphic menu choices. GUIs allow us to use pointing devices (such as the arrow
  keys on a keyboard or a hand-held mouse) to indicate desired choices. Actions taken with the mouse
  such as clicking or dragging are translated by the GUI into machine language instructions for the
  processor. The primary graphic element offered by the Windows operating system is the "desktop",
  a collection of icons, buttons, and pop-up menus on the screen that can be used to pass commands
  to the processor.
Methods of Control
Software designers who write operating software such as Windows must devise a method for
users to follow when they want to load a program and manipulate a data file. Historically, two
opposite approaches have been used for this purpose.
• The Program-Oriented Approach - in which the program is loaded first, and then used to load the
  data file. This approach assumes that the user knows which program to use. If this is the only
  method of control offered by an operating system and the user does not know which program to
  use, the only recourse is trial and error.
• The Data-Oriented Approach - focuses first on the data file. A user starts by indicating which data
  file is to be manipulated, and then the operating system determines which program file must be
  loaded to manipulate the selected data file. This approach relieves the user from having to know
  which program to use, but now relies on the operating system to know. This method is easier on the
  user, provided that the operating system chooses the correct program. The question is, how does
  the operating system know which program to use? This is where file association comes in. (See,
  we're getting closer.)
The program-oriented approach has been the one most commonly used by computer
programmers over the years. Most command-line user interfaces use this method of control. In
the example provided above about how to load and use the Paint program to manipulate a picture
file named "mypic.bmp", the command line was:
mspaint.exe mypic.bmp
The use of the ".exe" extension is optional, but the order of the words on the command line is
not. The program name is typed first (acting as a verb), followed by the name of the data file
(serving as an object to be acted upon). This format indicates first what program is being loaded
by the command. The name of the data file that follows is subsequently passed to the program
(once it is loaded into memory) so that it knows what data file to load. This program-oriented
approach also can be used in the graphic (point-and-click) interface of control offered by the
Windows Desktop. In this approach, a user starts by clicking on the [Start] button and selects
the name of the program to be loaded. After it is loaded, the user uses the program's menus or
quick keystrokes to load (or "open") the data file.
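The verb-then-object order of a program-oriented command line can be sketched as follows. This is a simplified illustration with a hypothetical function name. It assumes words are separated by spaces, does not handle quoted filenames that contain spaces, and only reproduces the rule that a missing ".exe" may be filled in.

```python
def parse_command(line: str) -> tuple[str, list[str]]:
    """Split a command line into (program file, list of data files)."""
    words = line.split()
    if not words:
        raise ValueError("empty command line")
    # The first word is the verb (the program); the rest are objects (data files).
    program, data_files = words[0], words[1:]
    # The ".exe" extension is optional when typing, so add it if it was omitted.
    if "." not in program:
        program += ".exe"
    return program, data_files


print(parse_command("mspaint mypic.bmp"))      # ('mspaint.exe', ['mypic.bmp'])
print(parse_command("mspaint.exe mypic.bmp"))  # ('mspaint.exe', ['mypic.bmp'])
```

Swapping the two words would make "mypic.bmp" the program to run, which is why the order is not optional.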
It has been discovered after studying the preferences of ordinary computer users (non-programmers) that a majority of them prefer the data-oriented approach for its simplicity. This
means that laypeople focus more on their data than on the processes that manipulate that data.
However, the data-oriented approach is not free from problems. It relies on the operating system
to always know which program should be used to manipulate each data file. It is not unusual for
us to have more than one program that is capable of manipulating a given type of data. In this
event, you might choose to use the program-oriented approach to controlling your computer,
where you select the preferred program first and then use it to load the data. The other option is
to rely on file association. (Finally - we're here!)
File Association - How the Data-oriented Approach to Computer Control Works
In order to facilitate the data-oriented approach to computer control, Windows maintains a table
that associates different types of data files with specific programs on your computer. Microsoft
refers to this table as the list of "registered file types" (although many in the industry call it the
file association table). It is stored on your hard disk in a group of files called the registry. (In
fact, the registry is where almost all of the essential Windows configuration data is stored.)
When Windows is installed on a computer, a basic table is stored containing a list of the most
widely known and used file types, such as: text, bit-mapped graphics, executable machine
language, hypertext documents (web pages), etc. Although this table starts out the same for each
version of Windows, it quickly becomes unique on every computer. As software is installed on
each computer, this table is updated and new file types are added to the list. For each file type,
Windows records:
• A name to identify it.
• One or more filename extensions that will be used to indicate its use.
• The name and location of one program on the computer to run whenever a user attempts to open a
  file of that data type by choosing it from within a folder, either by single or double
  clicking (depending on the version of Windows).
• Newer versions of Windows also record the names and locations of programs to use when the user
  indicates that a file should be either edited or printed. These are selected by right-clicking the file
  from within a folder list and then selecting either Edit or Print from a pull-out menu.
• Note: The most recent versions of Windows optionally also record the names and locations of
  alternate programs that could be selected for execution if a user attempts to open a file of that data
  type by right-clicking it from within a folder list and then selecting from a pull-out menu
  entitled "Open With...".
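The kind of information recorded for each file type can be pictured as a small lookup table. The sketch below is an illustration only: the class, the table entries, and the function name are hypothetical, and the real registry stores this information in its own layout. It shows how a data-oriented request ("open this file") could be turned into a program name, and what happens when an extension is not listed.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FileType:
    name: str                     # a name to identify the file type
    extensions: list[str]         # extensions that indicate this type
    actions: dict[str, str]       # action ("open", "edit", "print") -> program
    open_with: list[str] = field(default_factory=list)  # alternate programs


# Hypothetical entries for illustration; a real table differs on every computer.
TABLE = [
    FileType("Text Document", ["txt"],
             {"open": "notepad.exe", "print": "notepad.exe"}),
    FileType("Bitmap Image", ["bmp"],
             {"open": "mspaint.exe", "edit": "mspaint.exe"}),
]


def program_for(filename: str, action: str = "open") -> Optional[str]:
    """Return the program associated with the file's extension, or None if unlisted."""
    _, dot, ext = filename.rpartition(".")
    if not dot:
        return None  # no extension, so there is nothing to look up
    for file_type in TABLE:
        if ext.lower() in file_type.extensions:
            return file_type.actions.get(action)
    return None  # extension is not in the table


print(program_for("mypic.bmp"))           # mspaint.exe
print(program_for("notes.TXT", "print"))  # notepad.exe
print(program_for("myfile"))              # None
```

A result of None corresponds to the situation where the system has to ask the user which program to use.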
Weaknesses of the File Association System
Users need to be aware that the list of "registered file types" maintained by Windows is not
comprehensive. Many types of files can be stored on a computer that are not listed in the
registry. This can happen if you save a data file and give it a non-standard or incorrect extension,
or if you copy a file to your computer that was not recorded by one of the programs on your
computer. You may have experienced a situation where you attempted to activate a data file that
you saw listed in a folder and received an error message asking what program to use to open the
file. This means that the extension on that file was not listed in the registry on the computer that
you were using. It does not mean that you don't have a program capable of reading the file; it
simply means that there is no record of which program can do so listed in the registry of your
computer. You would still have the option of using the program-oriented approach to open the
data file, but you might have to do so by trial-and-error.
How to tell what your computer knows
If you look at the icon displayed along with each file in folders, you will notice that some of
them match those of programs on your computer, and others do not. For example, the icon
displayed next to files with a ".txt" extension is usually the icon for the Windows® Notepad (text
editor) program. If a file is displayed with the generic Windows icon, then that
file's extension (and therefore its data type) is not recorded in the registry of your computer. You
can add any file extension to the list of registered file types. The process differs a bit depending
on which version of Windows you have, but basically it is accessed like this:
1. Open any folder window.
2. Click on the menu choice Tools (or View if Tools is not there).
3. Select Folder Options from the menu.
4. Click on the tab labeled File Types at the top of the dialog box that appears.
From here you can view or edit the list, but do so with extreme caution. Mistakes here can have
serious effects on how your computer behaves. For more information, use the help features in
Windows.
How to get your computer to tell the whole truth
If you look at the file labels listed in folders, you will notice that some of them do not show their
extension. By default, Windows stores a setting in the registry that hides the extensions
of known (listed) file types. You can disable this setting to have Windows show you the
extensions of all files. The process differs a bit depending on which version of Windows you have.
For Windows XP, the steps are:
1. Open any folder window.
2. Click on the menu choice Tools (or View if Tools is not there).
3. Select Folder Options from the menu.
4. Click on the tab labeled View at the top of the dialog box that appears.
5. Clear (click on) the check mark next to the item in the list that says "Hide file extensions for
   known file types".
6. Click on the OK button at the bottom of the dialog box.
For the procedure in Windows Vista, see the Microsoft web page at:
http://windows.microsoft.com/en-us/windows-vista/Show-or-hide-file-name-extensions
For the procedure in Windows 7, see the Microsoft web page at:
http://windows.microsoft.com/en-US/windows7/Show-or-hide-file-name-extensions
You should now see all file extensions listed when viewing file names in your folders. Note that
you will be expected to also maintain these extensions yourself when naming or renaming files if
you alter this setting to stop hiding the known extensions.
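The effect of the "Hide file extensions for known file types" setting can be sketched as follows. The function name and the set of known extensions are hypothetical. The point is only that hiding the final extension of a listed type can make a name look like something it is not.

```python
def displayed_name(filename: str, known_extensions: set[str],
                   hide_known: bool = True) -> str:
    """Return the label a folder listing would show for a file."""
    stem, dot, ext = filename.rpartition(".")
    # Hide only the final extension, and only if it belongs to a known file type.
    if hide_known and dot and stem and ext.lower() in known_extensions:
        return stem
    return filename


KNOWN = {"txt", "exe", "bmp"}  # hypothetical list of registered extensions

for name in ["letter.txt", "picture.bmp.exe", "notes.xyz"]:
    print(name, "->", displayed_name(name, KNOWN))
# letter.txt -> letter
# picture.bmp.exe -> picture.bmp
# notes.xyz -> notes.xyz
```

With the setting disabled (hide_known=False), every name is shown in full, so a program file such as "picture.bmp.exe" can no longer pass for a picture.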
For more information about how the Windows user interface works, view the Tutorial Videos
About Windows XP on this web site.