Panayiotis Christodoulou
Objectives
• The troubleshooting process and the thinking skills required for successful troubleshooting
• Communication skills for troubleshooting
• Information resources to help solve computer problems
• Which tools are used to troubleshoot computer problems
• Strategies for troubleshooters
• How to develop a personal problem-solving strategy
• The types of common end-user computer problems
• How problem-solving processes are applied to several typical support problems
Computer problems come in a variety of forms, but most fall into one of six categories: hardware problems, software problems, user problems, documentation problems, vendor problems, and facilities problems.
Hardware problems generally stem from one of three sources. Hardware installation and compatibility problems tend to occur when a user purchases new hardware or upgrades an old hardware product. Incompatible computer components are those that cannot operate together on a system. Hardware configuration problems are difficulties that occur when hardware (or software) settings are incorrect for the computer environment in which a component must operate. The adoption of Plug and Play standards, which are industry-wide agreements among hardware and operating system vendors about hardware installation and configuration options, has greatly reduced hardware configuration problems. In addition to incompatibility and configuration problems, a small percentage of hardware problems result from components that either have never worked or no longer work. To avoid hardware problems after a system is installed, support staff at either the vendor or the worksite burn in a machine before the user receives it.
Burn-in is a 48- to 72-hour period during which a new computer or component is operated nonstop in an attempt to discover obvious problems and identify any marginal or temperature-sensitive components. Vendors also generally include hardware diagnostic tools with a new system; these tools can help a user support specialist detect common hardware malfunctions.
Software installation and upgrades are easier today than in the past, but software problems still occur most often during installation. Installation software is special-purpose utility software that aids in the installation of other software packages. It can automatically create all of the subdirectories with correct path names, examine the hardware configuration to determine whether the software and hardware are compatible, and set configuration options in the software and the operating system to match the hardware. User support specialists are often called upon to deal with installation problems or to install older software that does not install automatically.
Another source of compatibility problems is shareware. Shareware is commercial software that users can try out with the vendor’s permission during an evaluation period (usually 30 days) before making a purchase decision. Some shareware can cause conflicts because of the way it is written. A conflict is a state in which a computer component uses system resources (CPU, memory, or peripheral devices) in a way that is incompatible with another component.
Some software problems are related to the way software is configured to run on a system. Configuration problems result when software options are not set up properly. Starting with Windows 95, software and hardware configuration information is saved in a large system file called the Registry. The Registry can be edited with a software utility called REGEDIT.EXE.
Users and support specialists should not edit the Registry unless they are familiar with Registry entries and how to modify them.
Bugs are errors in a computer program that occur when a programmer writes incorrectly coded instructions during program development. Bugs are found more frequently in custom-developed software than in mass-market programs. Many bugs are eliminated during the testing phase, but updates and bug fixes are sometimes still issued several years after the initial public release.
Software publishers classify software distributions in several ways:
• A new version of a software package contains significant new features and is usually the result of a substantially rewritten program.
• An upgrade is a new version of an existing program that is sold at a reduced cost to owners of a previous version of the program.
• A new release of a program is a distribution that contains some new features not found in the original program.
• An update is a bug-fix distribution that repairs known problems in a previous version or release of a software package.
• A patch is a replacement for one or a few modules in a software package to fix one or more known bugs.
• A service pack (or service release) contains both updates and patches to fix documented problems.
Before installing a patch, a user should check with a support specialist. Users should also keep a copy of the patch installation file so that they can reapply the patch or patches if they ever need to reinstall the software from the original media. Some vendors incorporate prior patches into the next patch release. Others require you to install prior patches before installing the new patch. Where patches are not available, a support specialist or vendor might suggest a workaround. A workaround is a procedure or feature that accomplishes the same result as a feature that does not work due to a bug or other malfunction.
Performance problems are a category of computer problems in which a system is operational but does not operate as efficiently as it can or should. Performance problems can indicate hardware problems, but possible software causes should be explored before a hardware component is replaced. For example, a slow hard drive could be a sign that the drive is about to fail, but other possibilities should be checked first: how full the hard drive is, whether it needs defragmenting, whether it has lost space that a disk scan can recover, and whether the system has enough RAM to run software efficiently.
Users can unintentionally cause many support problems. All users, including professional support staff, make mistakes. Despite the best efforts of software developers, users occasionally press a wrong key and end up in a part of a program that they did not intend to load. Some key sequences are invalid and have no effect on the project that the user is working on. Other key sequences can have drastic consequences. Even when a user is at fault, a support specialist needs to tactfully guide the user toward training opportunities rather than assign blame.
Users frequently purchase an incorrect product because they misunderstand its features or limitations, or they purchase the wrong product to accomplish a task. It is common for someone who owns an older model PC to purchase a software package that requires a later model processor, or for a user to purchase the wrong version, such as a Macintosh version.
Many computer problems arise because users are poorly trained or do not read the documentation for the software or hardware. Quick start behavior is a tendency among computer users to forgo reading the installation manual and attempt to get new hardware or software installed and operational as rapidly as possible.
Lack of adequate training and understanding of the software can translate into waste and lost user productivity. Even when training has been adequate, users tend to forget how to perform tasks, or forget information such as passwords. Reference sheets (formal or informal notes) and scripts are effective aids that help users recall how to perform infrequent tasks that are difficult to remember.
Documentation can also cause problems. Vendor documentation has improved in recent years, but poorly organized and inaccurate documentation still frustrates many users and generates volumes of user support calls. The best user documentation includes a quick start tutorial.
Many calls to help desks or hotlines are related to a vendor overselling a product. Vendors often promise features to customers during development that are not actually in the final product. Vendors may also release products that have bugs because of time constraints. Vaporware refers to hardware or software products that appear in ads or press releases but are not yet available for sale.
Facilities problems include support calls about such things as viruses, backup media, security, and ergonomic issues.
Most support problems can be grouped into the six categories mentioned above. Networks are also a frequent source of problems and are among the most difficult problems to solve because they involve the interaction of hardware and software components.
User support staff use the problems that they solve as a bank of knowledge for solving new problems. The greater the variety of problems that user support specialists solve, the more knowledge and experience they will have when working on new or more complex problems. The book has seven problem examples that are accounts from support specialists about how they resolved some typical support problems. Below is a summary of the support cases. Please refer to your book for the examples.
In the first problem a user reported a problem with her sound card.
The support representative suggested that the user first try some of the obvious fixes, such as reseating the sound card in the expansion slot and checking the connections and cables to the sound card. After trying several more suggestions from his support colleagues, he asked the user if she had made any changes to the system around the time that the problem started. He suggested that she download the latest driver for her sound card. Reinstalling the sound card software solved the problem. The support specialist used communication skills and critical questions to open up a different avenue of investigation. He also used troubleshooting strategies including looking for a quick, obvious fix, some hypothesis testing, and module replacement.
In problem two, a user had problems accessing the Internet. The modem usually dialed the ISP’s computer successfully, but occasionally it would report that it could not get a dial tone. The technical support representative checked the modem and asked the user if she had any special features on her phone line. The user reported that it was a standard line. The modem checked out, so the support specialist decided to plug a handset into the line to make sure that there were no problems with the phone line. The phone line had voice mail waiting. Once the voice mail was cleared, the modem dialed in without a problem. The support representative suggested that the user clear the voice mail on her line before she tried to connect to her ISP. The support representative said that he learned that users do not always know the answers to the questions they are asked; the user’s answers threw him off track. He used his personal experience, elimination of variables, and hypothesis testing to solve this problem.
Problem three involves credit card processing software. The software vendor said the problem was likely on the credit card processor’s end, and the credit card processor pointed a finger at the software.
The support representative installed the software on his own machine and used the documentation to learn how to operate it. The resolution was that the operator had entered corrupt data that caused the report not to run properly. The support representative used trial and error as well as hypothesis testing. He also tried to replicate the problem.
Problem four involves a possible hard drive crash. The user started the machine and received the error message “Non system disk or disk error.” When it was determined that there was no disk in drive A: and rebooting the system did not solve the problem, the user wanted to jump ahead to talking about data recovery. The support person booted from the A: drive and looked at the hard drive. An examination of the system files showed that the MSDOS.SYS file had 0 bytes. By copying this file from another machine, the support representative was able to fix the problem. The support person used a variety of troubleshooting strategies, including looking for a quick, obvious fix, prior knowledge, help from colleagues, and a process of elimination to find the solution.
Problem five concerns a user who deleted the drive mappings for a network software package while doing some system housekeeping. After looking at the path settings for the icon and determining that the .exe file had not been deleted, the support representative had the user reboot the machine. Rebooting remapped the drive settings, and the user was able to access the software once again. The user support specialist used communication skills: listening to the user’s definition of the problem, paraphrasing, and asking critical questions to formulate a hypothesis.
Problem six involves a slow network connection. When the support specialist checked the network server, he found that the server light was on but the monitor was blank. He tried switching out several components to see if the video card or monitor was the problem.
When a known working monitor and video card were installed in the server and it still did not function properly, the support representative moved the server hard drive into a backup server. It booted without a problem. The motherboard was determined to be the problem, and the repair shop confirmed this. The support representative relied heavily on a module replacement strategy as well as a knowledge of how the subsystems in a computer are linked.
The final problem, problem seven, involves a computer that keeps losing time. It is important to the user that the computer clock stay accurate because of the posting that she does in the accounting program. The user reported that she had been resetting the clock manually. The problem-solving strategy began with asking the user when her computer is turned off. The support person explained to the user that the BIOS keeps track of the time when the computer is turned off and the operating system keeps track of the time when the system is running. The support representative asked the user to keep track of when the time loss occurred. When the user reported that the time slippage occurred more often when the computer was turned on for long periods of time, the user support representative eliminated the BIOS battery as the cause. Because things such as office temperature can cause problems with the clock, the support representative suggested installing a program that would automatically update the time from the Internet. This solution worked and the user was happy.
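As a rough illustration of what a time-update program like the one in problem seven does, the sketch below asks an Internet time server for the current time using the Simple Network Time Protocol and reports how far the local clock has drifted. It is not the program the support representative installed. The function names and the server name `pool.ntp.org` are assumptions chosen for the example, and the calculation ignores network delay, so the result is only approximate.

```python
import socket
import struct
import time

# Seconds between 1900-01-01 (NTP epoch) and 1970-01-01 (Unix epoch).
NTP_EPOCH_DELTA = 2208988800


def build_request() -> bytes:
    """Return a 48-byte SNTP client request (LI=0, version 3, mode 3)."""
    return b"\x1b" + 47 * b"\x00"


def parse_transmit_time(packet: bytes) -> float:
    """Extract the server's transmit timestamp as Unix seconds."""
    if len(packet) < 48:
        raise ValueError("NTP reply must be at least 48 bytes")
    # Bytes 40-47 hold the transmit timestamp: whole seconds, then a
    # 32-bit binary fraction of a second.
    seconds, fraction = struct.unpack("!II", packet[40:48])
    return seconds - NTP_EPOCH_DELTA + fraction / 2**32


def clock_offset(server: str = "pool.ntp.org", timeout: float = 5.0) -> float:
    """Return server time minus local time in seconds (assumed server name)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_request(), (server, 123))
        reply, _ = sock.recvfrom(512)
    return parse_transmit_time(reply) - time.time()
```

Calling `clock_offset()` on a machine with Internet access returns a positive number when the local clock is behind the server. Logging that value every few hours would give a support specialist the same evidence the user collected by hand: whether the clock loses time while the machine is running or while it is off. In practice the operating system's built-in time synchronization service is usually a better choice than a custom script.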