Windows® File Association

Table of Contents

What's in a name?
File Classification
User Interfaces
Methods of Control
File Association - How the Data-oriented Approach to Computer Control Works
Weaknesses of the File Association System
How to tell what your computer knows
How to get your computer to tell the whole truth

Of all the concepts that one should learn in order to easily understand and use the Windows® operating system, the most pivotal is undoubtedly the concept of file association. Oddly, this topic is seldom addressed directly in computer textbooks or tutorials. Rather, it is learned slowly and indirectly, as we gradually recognize how our computers respond to our actions. And yet file association directly affects how we must execute even the most basic commands on Windows-based computers, and it explains why one computer often reacts differently to a command or user action than another.
The purpose of this document is to explain in basic terms what file association is and how you can make use of it to control your computer. Like almost all topics in computers, this one is fundamentally based on the concept of data languages. Readers are advised to review the web page about computer data types and languages prior to reading this document to develop a literacy regarding terms such as "text", "binary", and "extension".

What's in a name?

Most computer users know that they are expected to assign a label to every file of data that they store, regardless of which program is doing the saving. We can deduce from this that the process of naming files is performed by the operating system (Windows in our case) rather than by the many individual application programs (such as word processors or accounting systems).

Every operating system defines rules that must be followed when selecting filenames. Some operating systems use filenames simply as unique labels to allow easy retrieval of the data later. But other systems place far more importance on these names than just using them as simple labels. Windows allows users to add a suffix known as an "extension" to the end of a file label, following a period, as an indicator of the data type (computer language) contained within the file. File extensions are examined by Windows and by application programs to determine what actions are appropriate for each file based on its data type.

For example, the extension ".exe" (short for executable) is often used to indicate that a file contains machine language (program instructions). Such files cannot be displayed on a computer screen without prior translation because they do not contain the "text" type of data that screens can decipher. Their language is understood only by processors. As such, the most appropriate action to perform upon these files is to send them to the processor for execution. Windows was written to recognize this fact.
Thus when a user sees a file named "game.exe" listed in a folder, the user can execute that program by typing simply game, rather than having to include the extension and type game.exe (which is also allowed).

It is important to realize and remember that file extensions are used only as indicators of a file's data type, not as guarantees. In other words, a file's extension is not what defines its data type. Files can be labeled with the wrong extension. For example, simply recording the extension ".exe" at the end of a filename does not make the file executable. For a file to be executable, it must be recorded by a programming language translator that knows the machine language of the type of processor on which it will be run. A file's data type is determined by the language abilities of the program that writes it and does not depend on the extension used to label the file.

Randy Gibson Page 1

The computer industry has standardized (somewhat) upon common usage of extensions to indicate specific data types, but there are many variations, dualities, and conflicts as well. Here are some basic facts about extensions to help you develop a sense of how they are used in Windows.

In current versions of Windows, a filename may contain more than one period, and all the characters following the rightmost period are interpreted as being the extension.
Earlier versions of Windows (and its predecessor DOS) restricted extensions to a maximum of 3 characters, but that limit no longer exists.
Although Windows can be configured to remember your use of uppercase and lowercase letters when naming a file, it ignores the difference when matching names. Windows is "case-aware", but it is not "case-sensitive".
Your choice of an extension is arbitrary, although some extension usage has become quite common (such as using ".txt" to indicate a "text file").
Also, many software companies have adopted specific extensions to act as "signatures" for their files (such as ".xls" to indicate a "Microsoft® Excel Spreadsheet").
You are not required to follow common naming conventions when assigning a filename extension, but failure to do so may affect how programs treat your file. Thus, if you choose to label a program file with an extension of ".txt", Windows will attempt to treat it like a text file in many situations, even though it is not a text file.
A filename is not required to have an extension, but omitting one limits how easily the file can be used. For example, if you save a text file and name it simply "myfile" rather than "myfile.txt", you will not be able to load that text file by just double-clicking on its filename. Instead, you will have to first load a program that knows how to read text (known as being "text literate") and then use the filing features within that program to open the file.
Many programs assign extensions automatically for the user when saving a file, but the user can override the program's actions by enclosing the entire filename (including the extension) in quotation marks when typing the filename.
Windows® is often configured to hide some extensions from view when displaying filenames. This was an unfortunate design choice by the authors, because the behavior appears inconsistent unless you have an intimate understanding of how Windows makes its decisions about file association. (Remember "file association"? We will get to that eventually, I promise.)
The icon (small graphic image) that is often displayed near a filename in listings such as folders and menus normally depends on which extension was used when the file was named.

File Classification

Operating systems such as Windows treat all stored files as belonging to one of two major categories:

Program Files - which contain instructions (in a variety of different programming languages) for the computer to perform.
Each program file has a name which the user can type as a keyword; doing so loads (copies) the instructions from the program file into the computer's main memory and then executes them. When a user types a command, the name of the program file serves as the verb.
Data Files - which contain the data that programs manipulate. The names of these files are used as objects when typing commands.

Data files contain the knowledge that is important to us, but they are of little use without program files to manipulate them.

User Interfaces

An essential component of every PC operating system is a program (or set of programs) called the user interface. This software addresses a fundamental limitation of today's computers - they don't understand human languages such as English. Computers can manipulate many languages, but they do not understand them. In fact, the only language that is native to a computer is the binary machine language of its processor (which is not intelligible to humans). A user interface is the part of the operating software that allows people and computers to interact with each other. User interfaces translate our actions (such as typing commands or clicking on icons) into machine instructions recognizable by the processor in our computer.

User interfaces come in a few basic types:

Command-line Interfaces - These require users to type lines of text-based commands on a keyboard. When using them, users must follow strict rules of syntax, in the same way that English users are expected to place the subject of a sentence in front of the verb. Before you can use this type of interface, you must learn its rules of syntax. Windows offers this method of control to its users through the Start Menu via the choice Run, which activates a small dialog box in which the user can type a command line.
If you wanted to edit a picture file named "mypic.bmp" using the Microsoft® Paint program, you could type the name of the program file "mspaint.exe" first, followed by the name of the data file "mypic.bmp" that the program is expected to manipulate. Thus, the complete command line would be:

    mspaint.exe mypic.bmp

The command-line interface would translate that line of text into machine language instructions and submit them to the processor for execution.

Many programs are now written using Menu-driven Interfaces. We interact with them by simply viewing a menu of choices on the screen and pressing a letter or numeral to indicate the command which we would like to use. The act of pressing the key is translated by the menu-driven interface software into machine language instructions for the processor.

Presently, the most popular user interface is the Graphical User Interface (or GUI - pronounced "goo'ee"), which involves the display of small pictures known as icons that serve as graphic menu choices. GUIs allow us to use pointing devices (such as the arrow keys on a keyboard or a hand-held mouse) to indicate desired choices. Actions taken with the mouse, such as clicking or dragging, are translated by the GUI into machine language instructions for the processor. The primary graphic element offered by the Windows operating system is the "desktop", a collection of icons, buttons, and popup menus on the screen that can be used to pass commands to the processor.

Methods of Control

Software designers who write operating software such as Windows must devise a method for users to follow when they want to load a program and manipulate a data file. Historically, two opposite approaches have been used for this purpose.

The Program-Oriented Approach - in which the program is loaded first and then used to load the data file. This approach assumes that the user knows which program to use.
If this is the only method of control offered by an operating system and the user does not know which program to use, the only recourse is trial and error.
The Data-Oriented Approach - focuses first on the data file. A user starts by indicating which data file is to be manipulated, and then the operating system determines which program file must be loaded to manipulate the selected data file. This approach relieves the user from having to know which program to use, but it relies on the operating system to know instead. This method is easier on the user, provided that the operating system chooses the correct program. The question is, how does the operating system know which program to use? This is where file association comes in. (See, we're getting closer.)

The program-oriented approach has been the one most commonly used by computer programmers over the years. Most command-line user interfaces use this method of control. In the example provided above about how to load and use the Paint program to manipulate a picture file named "mypic.bmp", the command line was:

    mspaint.exe mypic.bmp

The use of the ".exe" extension is optional, but the order of the words on the command line is not. The program name is typed first (acting as a verb), followed by the name of the data file (serving as an object to be acted upon). This format indicates first what program is being loaded by the command. The name of the data file that follows is then passed to the program (once it is loaded into memory) so that it knows what data file to load.

The program-oriented approach can also be used in the graphic (point-and-click) interface offered by the Windows Desktop. Here, a user starts by clicking on the [Start] button and selecting the name of the program to be loaded. After it is loaded, the user uses the program's menus or quick keystrokes to load (or "open") the data file.
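The verb-then-object order described above can be illustrated with a small Python sketch. It is only a simplified model, not how Windows itself parses commands: the function name `parse_command_line` is hypothetical, filenames containing spaces are not handled, and the rule of appending ".exe" is a simplification (Windows also tries other executable extensions).

```python
def parse_command_line(line):
    """Split a typed command into (program, arguments), program first."""
    words = line.split()
    if not words:
        raise ValueError("empty command line")
    # The first word is the verb (the program file); the rest are objects.
    program, *arguments = words
    # The ".exe" extension is optional when typing a program name, so this
    # sketch adds it when the user left it off (a simplifying assumption).
    if "." not in program:
        program += ".exe"
    return program, arguments


if __name__ == "__main__":
    # Both spellings name the same program and the same data file.
    print(parse_command_line("mspaint.exe mypic.bmp"))
    print(parse_command_line("mspaint mypic.bmp"))
```

Swapping the two words would make "mypic.bmp" the verb, which is why the order on the command line is not optional.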
Studies of the preferences of ordinary computer users (non-programmers) have found that a majority of them prefer the data-oriented approach for its simplicity. This means that laypeople focus more on their data than on the processes that manipulate that data. However, the data-oriented approach is not free from problems. It relies on the operating system to always know which program should be used to manipulate each data file. It is not unusual for us to have more than one program that is capable of manipulating a given type of data. In this event, you might choose to use the program-oriented approach to controlling your computer, where you select the preferred program first and then use it to load the data. The other option is to rely on file association. (Finally - we're here!)

File Association - How the Data-oriented Approach to Computer Control Works

In order to facilitate the data-oriented approach to computer control, Windows maintains a table that associates different types of data files with specific programs on your computer. Microsoft refers to this table as the list of "registered file types" (although many in the industry call it the file association table). It is stored on your hard disk in a group of files called the registry. (In fact, the registry is where almost all of the essential Windows configuration data is stored.)

When Windows is installed on a computer, a basic table is stored containing a list of the most widely known and used file types, such as text, bit-mapped graphics, executable machine language, and hypertext documents (web pages). Although this table starts out the same for each version of Windows, it quickly becomes unique on every computer. As software is installed on each computer, this table is updated and new file types are added to the list.

For each file type, Windows records:

A name to identify it.
One or more filename extensions that will be used to indicate its use.
The name and location of one program on the computer to run whenever a user attempts to open a file of that data type by choosing it from within a folder, either by single or double clicking (depending on the version of Windows).
Newer versions of Windows also record the names and locations of programs to use when the user indicates that a file should be either edited or printed. These are selected by right-clicking the file from within a folder list and then selecting either Edit or Print from a pull-out menu.

Note: The most recent versions of Windows optionally also record the names and locations of alternate programs that can be selected when a user opens a file of that data type by right-clicking it in a folder list and then choosing from a pull-out menu entitled "Open With...".

Weaknesses of the File Association System

Users need to be aware that the list of "registered file types" maintained by Windows is not comprehensive. Many types of files can be stored on a computer that are not listed in the registry. This can happen if you save a data file and give it a non-standard or incorrect extension, or if you copy a file to your computer that was not recorded by one of the programs on your computer.

You may have experienced a situation where you attempted to activate a data file that you saw listed in a folder and received a message asking what program to use to open the file. This means that the extension on that file was not listed in the registry on the computer that you were using. It does not mean that you don't have a program capable of reading the file; it simply means that the registry of your computer holds no record of which program can do so. You would still have the option of using the program-oriented approach to open the data file, but you might have to do so by trial and error.
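As a rough illustration of the lookup described above, the following Python sketch models the association table as a dictionary keyed by extension. This is only a toy model under stated assumptions: the table name `REGISTERED_FILE_TYPES`, the function `program_for`, and the type names are hypothetical, and the real table lives in the Windows registry rather than in a Python dictionary.

```python
import os.path

# Hypothetical association table: extension -> (type name, program to open with).
# The entries are illustrative only; every computer's real table differs.
REGISTERED_FILE_TYPES = {
    ".txt": ("Text Document", "notepad.exe"),
    ".bmp": ("Bitmap Image", "mspaint.exe"),
}


def program_for(filename, table=REGISTERED_FILE_TYPES):
    """Return the program associated with a file's extension, or None."""
    # splitext splits at the rightmost period, matching the extension rule.
    _, extension = os.path.splitext(filename)
    # Extensions are compared without regard to case.
    entry = table.get(extension.lower())
    return entry[1] if entry else None


if __name__ == "__main__":
    for name in ("mypic.bmp", "NOTES.TXT", "myfile", "data.xyz"):
        program = program_for(name)
        if program is None:
            # The weakness described above: no record, so the user must choose.
            print(f"{name}: no registered program, ask the user")
        else:
            print(f"{name}: open with {program}")
```

A result of None here corresponds to the situation where Windows asks which program to use: the file may still be readable by some installed program, but the table holds no record of it.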
How to tell what your computer knows

If you look at the icon displayed along with each file in folders, you will notice that some of them match those of programs on your computer, and others do not. For example, the icon displayed next to files with a ".txt" extension is usually the icon for the Windows® Notepad (text editor) program. If a file is shown with the generic Windows icon, then that file's extension (and therefore its data type) is not recorded in the registry of your computer.

You can add any file extension to the list of registered file types. The process differs a bit depending on which version of Windows you have, but basically it is accessed like this:

1. Open any folder window.
2. Click on the menu choice Tools (or View if Tools is not there).
3. Select Folder Options from the menu.
4. Click on the tab labeled File Types at the top of the dialog box that appears.

From here you can view or edit the list, but do so with extreme caution. Mistakes here can have serious effects on how your computer behaves. For more information, use the help features in Windows.

How to get your computer to tell the whole truth

If you look at the file labels listed in folders, you will notice that some of them do not show their extension. Windows has a setting, stored in the registry and enabled initially, that hides the extensions of known (listed) file types. You can disable this setting to have Windows show you the extensions of all files. The process differs a bit depending on which version of Windows you have. For Windows XP, the steps are:

1. Open any folder window.
2. Click on the menu choice Tools (or View if Tools is not there).
3. Select Folder Options from the menu.
4. Click on the tab labeled View at the top of the dialog box that appears.
5. Clear (click on) the check mark next to the item in the list that says "Hide file extensions for known file types".
6. Click on the OK button at the bottom of the dialog box.

For the procedure in Windows Vista, see the Microsoft web page at:
http://windows.microsoft.com/en-us/windows-vista/Show-or-hide-file-name-extensions

For the procedure in Windows 7, see the Microsoft web page at:
http://windows.microsoft.com/en-US/windows7/Show-or-hide-file-name-extensions

You should now see all file extensions listed when viewing file names in your folders. Note that if you alter this setting to stop hiding the known extensions, you will be expected to maintain these extensions yourself when naming or renaming files.

For more information about how the Windows user interface works, view the Tutorial Videos About Windows XP on this web site.
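The note about maintaining extensions yourself when renaming files can be made concrete with a short Python sketch. It assumes only the rule stated earlier in this document, that everything after the rightmost period is the extension; the function names `split_name` and `rename_keeping_extension` are hypothetical helpers written for illustration, not part of Windows.

```python
def split_name(filename):
    """Split a filename at its rightmost period into (label, extension)."""
    label, period, extension = filename.rpartition(".")
    if not period:
        # No period at all: the whole name is the label, with no extension.
        return filename, ""
    return label, period + extension


def rename_keeping_extension(old_name, new_label):
    """Build a new filename that keeps the old file's extension."""
    _, extension = split_name(old_name)
    return new_label + extension


if __name__ == "__main__":
    print(split_name("budget.2024.xls"))
    print(rename_keeping_extension("mypic.bmp", "holiday"))
```

Renaming "mypic.bmp" to just "holiday" would drop the extension and, with it, the association that lets the file open by double-clicking; carrying the extension over avoids that.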