Troubleshooting: Difference between revisions

From InstallGentoo Wiki v2
>Ergopon
>Mrsnooze
Troubleshooting is the process of finding the cause of a problem with the aim of fixing said problem.
 
Once a problem is well defined (e.g. by an error code, or by a specific sequence of steps that reproduces it), a solution is much easier to find through a search engine and the problem is much easier to describe to others (e.g. in bug reports).
 
Software and hardware manuals often include troubleshooting guides. These contain solutions to common problems caused by misconfiguration and should not be overlooked.
 
= Diagnosing a Problem =
== Divide and Conquer ==
The golden rule in finding the cause of a problem is to '''Divide and Conquer'''.
 
Ruling out as many possible causes as you can narrows your focus to where it's needed.
 
For example, let's say your computer is unexpectedly rebooting. No error message pops up and there are no warnings before it happens. It just happens, and it's a mystery. Begin by cutting down on the possibilities:
* Boot it up and don't run anything. Just let it sit there. Still rebooting?
** If not, software you run is probably triggering the reboot.
* Live boot a Linux distro.
** If the problem goes away, the fault most likely lies in your installed operating system rather than the hardware.
* Unplug your USB accessories, leaving just your keyboard and mouse.
** If the problem goes away, one of your USB accessories is the likely culprit.
* Unplug extra HDDs and your optical drive.
* Remove your GPU and use onboard graphics.
* Remove everything but your mobo/CPU/RAM, keyboard and monitor.
** Still happening? Try a different keyboard. Try a different monitor. Try each RAM stick individually.
*** Still happening? The likely fault is the CPU or (much more likely) the mobo. The PSU is also still connected at this point, so don't rule it out.
 
At some point the problem will stop occurring, and then you will know which hardware or software is causing it.
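
When there are many suspects, the same narrowing-down can be done by halves instead of one item at a time. The sketch below is only an illustration: <code>find_culprit</code> and <code>still_fails</code> are made-up names, and it assumes exactly one suspect causes the fault and that your test is reliable (i.e. the problem is not intermittent).

```python
from typing import Callable, Sequence


def find_culprit(suspects: Sequence[str],
                 still_fails: Callable[[Sequence[str]], bool]) -> str:
    """Return the single suspect that triggers the fault.

    still_fails(enabled) is a hypothetical test you supply: it should return
    True if the problem occurs with only the `enabled` suspects active.
    Assumes exactly one culprit and a reliable test.
    """
    if not suspects:
        raise ValueError("need at least one suspect")
    candidates = list(suspects)
    while len(candidates) > 1:
        half = candidates[:len(candidates) // 2]
        # If the fault shows up with only the first half active, the culprit
        # is in that half; otherwise it must be in the remainder.
        if still_fails(half):
            candidates = half
        else:
            candidates = candidates[len(half):]
    return candidates[0]


if __name__ == "__main__":
    parts = ["usb hub", "webcam", "second hdd", "optical drive", "gpu"]
    # Simulated fault: the machine reboots whenever the webcam is connected.
    print(find_culprit(parts, lambda enabled: "webcam" in enabled))
```

In real life <code>still_fails</code> is you, unplugging half of the parts and watching whether the machine still reboots. Halving just means fewer rounds of that than testing every part on its own.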
 
== Sanity Checking ==
[[File:Bootfail.png|thumb|Boot Failure Flow Chart]]
A useful technique in troubleshooting is the sanity check: double checking really basic stuff that you've probably discounted without testing:
* Is it plugged in?
* Is it turned on?
* Are the drivers installed?
* Is it up to date?
* Can you ping google.com?
* RTFM!
 
Any anon worth their salt will have struggled with a problem for hours only to find something wasn't plugged in. Embarrassing? Yes. But it happens to all of us.
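
If you keep running the same basic checks, it can help to write them down as a small script. This is a rough sketch, not a real diagnostic tool: <code>run_sanity_checks</code> and <code>can_resolve</code> are hypothetical helpers, and <code>can_resolve</code> only tests name resolution (a rough stand-in for "can you ping it?"), not an actual ICMP ping.

```python
import socket
from typing import Callable, List, Tuple

# A check is a description plus a function that returns True when all is well.
Check = Tuple[str, Callable[[], bool]]


def can_resolve(hostname: str) -> bool:
    """Return True if the system resolver can look up `hostname`."""
    try:
        socket.gethostbyname(hostname)
        return True
    except OSError:
        return False


def run_sanity_checks(checks: List[Check]) -> List[str]:
    """Run every check and return the descriptions of the ones that failed.

    A check that raises is counted as failed: a sanity check that cannot
    even run is itself worth a look.
    """
    failed = []
    for description, check in checks:
        try:
            ok = bool(check())
        except Exception:
            ok = False
        print(f"[{'ok' if ok else 'FAIL'}] {description}")
        if not ok:
            failed.append(description)
    return failed


if __name__ == "__main__":
    # "localhost" keeps the demo offline; swap in a real hostname yourself.
    run_sanity_checks([
        ("name resolution works", lambda: can_resolve("localhost")),
    ])
```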
 
== Reproducibility ==
Reproducibility is the ability to recreate the problem at will. If you cannot recreate the problem, you cannot fully test your solution.
 
Also do your best to narrow down the steps required to reproduce a problem. Cutting the process down to the minimum tells you more about the problem and eliminates unrelated side steps.
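
Cutting down the steps can be done mechanically: drop one step, try again, and keep the shorter list only if the problem still shows up. A minimal sketch of that idea follows. <code>minimize_steps</code> and <code>reproduces</code> are hypothetical names, and this is a simple greedy pass that assumes a reliable yes/no reproduction test, not a full delta-debugging implementation.

```python
from typing import Callable, List


def minimize_steps(steps: List[str],
                   reproduces: Callable[[List[str]], bool]) -> List[str]:
    """Drop steps one at a time, keeping a removal only if the problem
    still reproduces without that step."""
    if not reproduces(steps):
        raise ValueError("the full list of steps does not reproduce the problem")
    needed = list(steps)
    i = 0
    while i < len(needed):
        trial = needed[:i] + needed[i + 1:]
        if reproduces(trial):
            needed = trial  # step i was irrelevant; index i now holds the next step
        else:
            i += 1          # step i is required; move on to the next one
    return needed


if __name__ == "__main__":
    steps = ["open browser", "open video player", "play 4k file",
             "plug in headphones", "seek forward"]
    # Simulated problem: the crash needs both playing the file and seeking.
    crash = lambda s: "play 4k file" in s and "seek forward" in s
    print(minimize_steps(steps, crash))
```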
 
== Intermittent Problems ==
Problems which occur infrequently are some of the hardest to define. If your computer reboots once a month and you're unable to reproduce the problem, it can be very tricky (and frustrating/fatiguing) to narrow down the cause of the problem.
 
What you can do:
* Log, log, log. Turn all system logging to its most verbose setting, so that when the problem finally does occur, you have as much information about it as possible.
* Document the experience. What were you running when it happened? What were you connected to? What was the room temperature? The time? The date? What sounds did you hear? What else did you notice?
 
Keep this up, and eventually a pattern will emerge.
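
Documenting the experience works best when every incident is written down the same way, so the notes can be counted later. Here is one possible sketch using a JSON-lines file. The function names, the file name and the note keys (<code>running</code>, <code>connected_to</code>) are all assumptions made for the example; it covers your own notes only, not the system's logging settings.

```python
import json
import tempfile
from collections import Counter
from datetime import datetime
from pathlib import Path
from typing import Optional


def record_incident(log_path: Path, notes: dict,
                    when: Optional[datetime] = None) -> None:
    """Append one incident as a JSON line: a timestamp plus free-form notes."""
    stamp = (when or datetime.now()).isoformat(timespec="seconds")
    entry = {"time": stamp, **notes}
    with log_path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(entry) + "\n")


def count_by(log_path: Path, key: str) -> Counter:
    """Count how often each value of `key` appears across recorded incidents."""
    counts: Counter = Counter()
    with log_path.open(encoding="utf-8") as f:
        for line in f:
            entry = json.loads(line)
            if key in entry:
                counts[entry[key]] += 1
    return counts


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        log = Path(tmp) / "incidents.jsonl"
        record_incident(log, {"running": "video encode", "connected_to": "usb dock"})
        record_incident(log, {"running": "web browser", "connected_to": "usb dock"})
        # The value that keeps coming back is where to look first.
        print(count_by(log, "connected_to").most_common())
```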
 
= Finding Solutions =
Once you have a decent idea of the cause of a problem, it's time to find the solution. 99.9% of the time someone else has already had this problem before you. 99% of the time the solution is waiting on the other side of a search engine.
 
== Basic Solution Finding ==
Use your search engine and give it the good stuff:
* Error codes.
* Error messages in quotes.
* Official software names.
* Software versions.
* Official operating system names.
* Operating system versions.
* Hardware manufacturer names.
* Hardware model names.
 
Never forget that search engines do not speak English and have no idea what you're talking about. Search engines compare your keywords to their database. "Soft" words and phrases like "i have a problem with" will either fill your results with junk or give you no results.
 
Good searching:
* realtek dtv1000t "error 0x03b: tuner not ready"
* nvidia gtx970 antialias crash "far cry 4"
* linux mint 17.1 amd catalyst "won't compile"
* corsair 750w psu "high temperature"
 
Bad searching:
* why does windows crash when i click on the clock
* nvidia problem
* my computer crashes on tuesday and sometimes wednesday
* linux errors 2015
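
The difference between the good and bad searches above comes down to two habits: drop the soft words, and put the exact error message in quotes. The sketch below shows both. <code>build_query</code> and the <code>SOFT_WORDS</code> list are invented for this example, and the list is nowhere near complete.

```python
from typing import Iterable, Optional

# A tiny, hypothetical list of words that add nothing to a search.
SOFT_WORDS = {"i", "have", "a", "an", "the", "my", "problem", "with",
              "why", "does", "when", "help", "please"}


def build_query(terms: Iterable[str], error_message: Optional[str] = None) -> str:
    """Join hard keywords and append the exact error message in quotes."""
    keywords = [
        word
        for term in terms
        for word in term.split()
        if word.lower() not in SOFT_WORDS
    ]
    if error_message:
        # Quotes ask the engine for the exact phrase; strip embedded double
        # quotes so they don't end the phrase early.
        keywords.append('"' + error_message.replace('"', "").strip() + '"')
    return " ".join(keywords)


if __name__ == "__main__":
    print(build_query(["realtek", "dtv1000t"], "error 0x03b: tuner not ready"))
    print(build_query(["i have a problem with my nvidia gtx970"]))
```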
 
== Finding Problem Communities ==
If you can't find a solution to your problem, you can probably find other people with the same issue.
 
They may have other things to try which you haven't thought of, or may be searching for someone just like you who has a slightly different setup to help find the cause of the issue. You're likely to find links to official bugtrackers or manufacturer websites where the problem is acknowledged.
 
Participate and add your voice. Another "me too" comment with your specs and settings will not hurt.
 
You may also find out that the problem you're having is a known issue and unresolved (which tells you that you successfully diagnosed the problem and that there's no solution yet).
 
You may also find workarounds for your problem.
 
== Reporting a Unique Problem ==
If you're convinced you're experiencing a problem that nobody else has come across, report it! This is especially true if you're running current or beta software.
 
Find the official forum/bugtracker for the problem you're having, sign up, and post:
* Give your exact hardware/operating system/software versions.
* Detail the steps to reproduce the problem.
* Provide whatever logs you can. Paste long logfiles into a service like pastebin and link to them, rather than making your bug report hard to read.
* Explain the steps you've taken to resolve the problem.
* Copy/paste links to anything helpful you've found in your troubleshooting.
* Post your gut feeling, if you have one, and explain that it's just a gut feeling. We all have them, and sometimes they're correct.
* Keep up with your post: With any luck you'll be asked some questions by others who want to help you. Just having other people say "me too!" will reassure you that you've done a good job in diagnosing the problem.
 
While this step doesn't immediately solve your problem, it does make you a key figure in resolving it, and for that you should feel pride.
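
If you report problems often, a small template keeps you from forgetting any of the items listed above. The following is just one way to lay it out. The <code>BugReport</code> class and its field names are assumptions for this sketch, not a format that any particular bugtracker requires.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BugReport:
    summary: str
    versions: List[str]              # exact hardware / OS / software versions
    steps_to_reproduce: List[str]
    log_links: List[str] = field(default_factory=list)      # e.g. pastebin links
    already_tried: List[str] = field(default_factory=list)
    helpful_links: List[str] = field(default_factory=list)
    gut_feeling: str = ""

    def render(self) -> str:
        """Return the report as plain text, skipping empty optional sections."""
        lines = [self.summary, "", "Versions:"]
        lines += [f"  - {v}" for v in self.versions]
        lines += ["", "Steps to reproduce:"]
        lines += [f"  {n}. {s}" for n, s in enumerate(self.steps_to_reproduce, 1)]
        optional = (("Logs", self.log_links),
                    ("Already tried", self.already_tried),
                    ("Related links", self.helpful_links))
        for title, items in optional:
            if items:
                lines += ["", f"{title}:"] + [f"  - {item}" for item in items]
        if self.gut_feeling:
            # Labelled clearly so nobody mistakes a hunch for a diagnosis.
            lines += ["", f"Gut feeling (unverified): {self.gut_feeling}"]
        return "\n".join(lines)


if __name__ == "__main__":
    report = BugReport(
        summary="Random reboot while idle",
        versions=["OS: (exact name and version)", "Mobo: (exact model)"],
        steps_to_reproduce=["Boot to desktop", "Leave idle until it reboots"],
        already_tried=["Unplugged all USB accessories", "Tested each RAM stick"],
        gut_feeling="power supply",
    )
    print(report.render())
```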


[[Category:Tutorials]]
[[Category:HowTo]]
[[Category:Software]]
[[Category:Hardware]]

Revision as of 11:55, 3 April 2015
