Discussion:
[conspire] 3rd Master Hard Disk Error
p***@ieee.org
2018-11-18 19:21:55 UTC
Permalink
During POST and before GRUB, one of my computers gives the error message
    3rd Master Hard Disk Error
and says press F1 to continue.
Well, this PC has 3 hard drives.  Which one is it?   Doing a web search I find "information" in 2 categories:

* Access the BIOS to turn off the test.
* Replace "the hard drive".

At first I thought "3rd" meant a specific drive like /dev/hdc.   Apparently computers with only 1 drive generate the same message.

I have taken the precaution to make copies of important data on different physical drives.

What next?
Steve M Bibayoff
2018-11-19 00:31:37 UTC
Permalink
Hello,
Post by p***@ieee.org
During POST and before GRUB, one of my computers gives the error message
3rd Master Hard Disk Error
and says press F1 to continue.
...
Post by p***@ieee.org
What next?
Well, you could post what type of system you have. How many HDs? What
kind are they (IDE, SATA, version)? How are the HDs connected? Same
bus? Different controller? Do you have a CD/DVD player? What does the
BIOS say is connected where?
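If the box still boots into Linux, most of that can be answered from a
shell, by the way. A rough sketch (assuming the usual util-linux and
pciutils packages are present, which they nearly always are):

    # one line per physical disk: model, serial, size, transport (sata/ata/usb)
    lsblk -d -o NAME,MODEL,SERIAL,SIZE,TRAN
    # which disk controllers the kernel sees on the PCI bus
    lspci | grep -i -E 'sata|ide|ahci'
    # how the kernel enumerated the ATA links at boot
    dmesg | grep -i -E 'ata[0-9]'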

In my experience, this is usually either a sata controller issue, or a
problem with the circuit board on the HD itself (the platters would
report fine with whatever test you run).

Just a start,
Steve
Rick Moen
2018-11-19 01:24:11 UTC
Permalink
Post by Steve M Bibayoff
In my experience, this is usually either a sata controller issue, or a
problem with the circuit board on the HD itself (the platters would
report fine with whatever test you run).
It's a very good point that the error reported could be caused by HBA
circuitry problems. As before, logical thinking helps, e.g., after the
testing I recommended upthread one _thinks_ the problem has been
isolated to a specific hard drive -- but it's cheap insurance to then
cross-check to ensure that the cause isn't a cable or HBA defect, i.e.,
reconnect that drive to a different cable and HBA. Does the problem
move with the drive? If yes, it's the drive. If no, it's the HBA or
cable.
Rick Moen
2018-11-19 01:16:33 UTC
Permalink
Post by p***@ieee.org
I have taken the precaution to make copies of important data on different physical drives.
Absolutely the right first step.
Post by p***@ieee.org
What next?
I don't know what to make of your quoting the BIOS info screen as saying
'3rd Master Hard Disk Error, press F1 to continue', but then also
observing that computers with only 1 drive generate the same message.
That contradiction seems a bit mystifying.

However, one can always fall back on good ol' logic. I infer from this
machine having three physical hard drives that it's not a laptop, which
is good news because it means you can easily open up the case and do
hardware work. Now that you've secured backups, you could proceed like
this:

1. Disconnect your computer from power. Bring the system unit to a
workspace with generous room and decent lighting. Bring out some
ramekins or small bowls to hold screws. Unscrew the case cover.
Perhaps with the aid of a flashlight, take a good long look at how the
motherboard and the three drives are connected.

Newer PCs will cross-connect to the motherboard's host bus adapter (HBA)
circuitry using SATA cabling. Older PCs will instead use PATA ribbon
cabling, popularly but vaguely called 'IDE'.

Typical SATA cable:
https://cs-electronics.com/product/sata-cable-7-pin-sata-connector-both-ends/

Typical PATA cable:
https://commons.wikimedia.org/wiki/File:PATA-cable.jpg

The power connections I'm used to seeing are separate from the data
cables (mentioned above) and use the familiar 4-pin Molex plugs.
However, my understanding is that some SATA drives now have a newfangled
15-pin SATA-specific power attachment, and naturally there are converter
cables, e.g., picture here:
https://www.newegg.com/Product/Product.aspx?Item=N82E16812200061&Description=sata%20power%20cable&cm_re=sata_power_cable-_-12-200-061-_-Product

2. Maybe you should use a small flashlight to see any labelling on the
motherboard where the data cable from each hard drive connects. You
might see wording like 'Primary' / 'Secondary' in the case of PATA, or
'SATA1', 'SATA2', etc. in the case of SATA. Feel encouraged to take
notes about what's connected to what.

The basic scheme for SATA's a lot simpler than for PATA, reflecting
having learned from certain dismal aspects of PATA and happily leaving
them behind. A SATA connector goes from the HBA circuitry directly to
one (1) physical drive, period. Each connector w/cable drives (at most)
one SATA device. By contrast, PATA was designed to let the HBA
connector w/cable drive either one or two PATA physical drives (or, of
course, none). Thus, most PATA ribbon cables have -three- connectors,
so as to accommodate two drive devices on the 'PATA chain'. To make
this functionality work and the HBA circuitry be able to communicate
distinctly with the (up to) two remote drive devices, the drives had to
be physically configured to assert either that they were the 'master' or
'slave' device on the chain or were the only device, as appropriate.
This was usually done by setting a jumper on each drive, though
alternatively it could be done using a horror called 'cable select':
https://en.wikipedia.org/wiki/Parallel_ATA#Cable_select

If this is PATA, you'll need to understand the basics of the above,
because the next thing we're going to do is selectively disconnect
drives. OTOH, if it's SATA, you've lucked out, as then whole areas
of complication go away.
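(If the machine still boots into Linux, you can also read each drive's
model and serial number from the OS side, which makes it easier to match
the BIOS complaint, the paper label on each drive, and the cable it hangs
off. A rough sketch, assuming smartmontools and hdparm are installed --
they may not be, by default -- and run as root:

    smartctl -i /dev/sda           # model, serial, firmware for the first drive
    hdparm -I /dev/sda | head -20  # same information from the ATA identify data

Repeat for /dev/sdb and /dev/sdc and note which serial lives on which
connector.)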

By the way, this would also be an excellent time to visit the BIOS Setup
screens and take down a note or two about whatever it says concerning
the hard drives. It's incredibly rare in modern motherboards for these
to matter a lot, but it couldn't hurt. At worst, this is a precaution
against burning your bridges. At best, you might see something worth
pondering and investigating. (Obviously, you'll need system power again
to look there. When you're done, disconnect the system unit from AC
power, again.)

3. Now that you've studied your system and taken some notes about the
three drives and their connection, it's time to figure out which one is
squawking. Disconnect the power connectors from the backs of two of
the hard drives. For this purpose, there's no need to disconnect the
data cables. (Therefore, don't do that, at this time, on KISS grounds.)

Which drive should remain connected to a power feed? It doesn't much
matter, but one might indulge a small burst of OCD and let it be
whatever's the 'first' drive, which is to say the drive on the SATA1
connector for SATA, or the 'master' drive on the 'primary' chain for
PATA.

As usual, PATA introduces a pain-in-the-ass complication, here: If the
drive still connected to power has been part of a two-drive PATA chain,
then it was probably jumpered for that role (unless cable select was
employed, which please see). Thus, you'll probably need to rejumper it
for the single-drive role it's about to assume for testing purposes.
Do so.

No need to close up the case, but you can now reconnect power and go
through Power-On Self-Test (POST) again. Still see '[something] Hard Disk
Error, press F1 to continue'? Then, that's your problem drive. If not,
yank system power and try each of the other two drives as the sole drive
receiving power, the same way, to test them, until you find it. (It
doesn't matter whether anything can boot, because you're using POST
diagnostics.)

And, of course, when you're done, put jumpers and power connectors back.

Also, it really couldn't hurt to gently reconnect and refasten all PATA
or SATA data connectors. Those coming loose could cause bewildering and
sometimes intermittent problems.


You know, by the way, Linux may have been muttering about read/write
problems in the system log files (usu. /var/log/messages), including
very specific /dev/sdX citations. You could look. Small amount of stuff
about that here: https://www.suse.com/support/kb/doc/?id=7006510

(SCSI notation is used even for non-SCSI mass storage because all the
relevant drivers now leverage the SCSI layer. That's why you don't see
/dev/hdX device nodes any more, only /dev/sdX.)
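A rough sketch of what that search might look like -- log file names vary
by distro, and smartctl is only there if smartmontools is installed:

    # kernel complaints about ATA links or I/O errors
    grep -i -E 'ata[0-9]+.*(error|fail)|i/o error' /var/log/messages*
    # on systemd machines that lack /var/log/messages:
    journalctl -k | grep -i -E 'ata[0-9]+.*(error|fail)|i/o error'
    # quick SMART health verdict for one drive (as root)
    smartctl -H /dev/sda

A 'PASSED' from smartctl isn't a clean bill of health, but a 'FAILED' is
a strong hint.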
p***@ieee.org
2018-11-19 06:21:26 UTC
Permalink
in /var/log/messages

* `grep -v scsi` returns nothing.

* Several pairs of messages like the following:

messages.1:Nov 11 22:26:15 PZ01 org.gtk.vfs.UDisks2VolumeMonitor[1283]: disc.c:352: error opening file BDMV/index.bdmv
messages.1:Nov 11 22:26:15 PZ01 org.gtk.vfs.UDisks2VolumeMonitor[1283]: disc.c:352: error opening file BDMV/BACKUP/index.bdmv

Devices are indeed listed as /dev/sda, /dev/sdb and /dev/sdc.
Sizes are roughly 100 GB, 200 GB and 900 GB, which gives some clue as to how old they are.

I'm thinking that once I get the box open, it might be time to buy a new TB drive.  I had been thinking it was time to reinstall the OS anyway.  If /usr/bin is on a drive with an SSD buffer, everything will run much faster.
Rick Moen
2018-11-19 06:41:39 UTC
Permalink
Post by p***@ieee.org
in /var/log/messages
* `grep -v scsi` returns nothing.
messages.1:Nov 11 22:26:15 PZ01 org.gtk.vfs.UDisks2VolumeMonitor[1283]: disc.c:352: error opening file BDMV/index.bdmv
messages.1:Nov 11 22:26:15 PZ01 org.gtk.vfs.UDisks2VolumeMonitor[1283]: disc.c:352: error opening file BDMV/BACKUP/index.bdmv
Well, that _could_ be a sign of a failing drive, but my offhand
impression is that's really thin in the context of the BIOS raising an
alarm during POST. It doesn't seem to match.
Post by p***@ieee.org
Devices are indeed listed as /dev/sda, /dev/sdb and /dev/sdc.
Sizes are roughly 100 GB, 200 GB and 900 GB, which gives some clue as to how old they are.
I'm thinking that once I get the box open, it might be time to buy a new TB drive. I had been thinking it was time to reinstall the OS anyway. If /usr/bin is on a drive with an SSD buffer, everything will run much faster.
If during diagnosis you isolate the problem cause to a specific physical
drive device, don't hasten to throw it away without some further steps.
Specifically, each manufacturer (Samsung, Western Digital, Seagate,
etc.) offers for download diagnostic and repair software that in many
cases can massage an old drive into like-new condition. Look around and
find (and use) the one for your device. Example:
https://www.seagate.com/support/downloads/seatools/
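A vendor-neutral complement to those tools, if you'd rather stay inside
Linux, is smartmontools (assuming it's installed; run as root, /dev/sdb
below is just a placeholder for whichever drive you suspect, and the long
test can take hours):

    smartctl -t short /dev/sdb   # quick electrical/mechanical self-test
    smartctl -t long /dev/sdb    # full surface scan
    smartctl -a /dev/sdb         # later: test results plus the attribute table

The attributes worth staring at are Reallocated_Sector_Ct and
Current_Pending_Sector; non-zero and climbing is bad news.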
p***@ieee.org
2018-11-19 17:52:55 UTC
Permalink
The discussion about disk drives caused me to think, why don't I just replace the whole thing, especially with Black Friday approaching?   Which brings up some considerations that would be of importance to a Linux install.

Graphics:   I recall fussing in the past with non-free drivers.   Do I want to avoid nVidia?  

Intel vs. AMD.   I happen to have machines with AMD.

number of cores/threads.   Despite many years of talk, is real parallel processing real?  I'm not into gaming. 
Rick Moen
2018-11-19 20:26:53 UTC
Permalink
Post by p***@ieee.org
The discussion about disk drives caused me to think, Why don't I just
replace the whole thing, especially with Black Friday approaching.
Yeah, you could do that cheaply. You _might_ end up with a system with
few or no Linux driver issues, but the odds are against you even if you
avoid Nvidia chipsets (which would be a good start).

Back in the latter 2000s, I did all evaluations for Cadence Design
Systems of proposed new systems (servers, desktops, and laptops) the firm
might wish to standardise on, for driver compatibility with Linux
(various RHEL and SLES) and Solaris. So, I got to play with a lot of
spanking-new computers, and could have told you a lot _at that time_
about what motherboard, etc. chipsets to avoid for reasonable compliance
with the Linux distro installers of the day. The specifics I knew are
of course long past their expiration date. Some general guidelines are
timeless.

One: If you honestly want no hardware-support problems, avoid buying
anything with a chipset that hasn't been on the market for at least a year.
The simplest way to ensure that is to buy a _system_ model that's been
offered for sale for at least a year. (Even then, there are occasional
gotchas where an OEM has slipstream-changed the constituent parts of a
PC system without bothering to change the model number. Dell is
particularly infamous for doing this.)

Two: In turn, the easiest way to ensure that the PC model you buy has
been on the market for at least a year is to buy used, not new. Novices
fear to do this because they think they need high-end spec hardware to
run Linux, an assumption that's hilarious to us old fogies who know Linux
typically _lowers_ the hardware requirements compared to proprietary
OSes the hardware was built for.

Three: Yeah, try to stay away from Nvidia chipsets, because good
support for them tends to take a bit longer still.

Four: If you see Broadcom or Marvell components, e.g., ethernet or
wireless, there's a good chance you'll be hassled a bit, if only by the
need to acquire and install firmware BLOB files that the distros cannot
lawfully bundle into their installers.


Hey, here's an idea: You buy a brand-new system on Black Friday, and
then get frustrated by various dumb Linux driver problems or some new
equivalent of the infamous Intel Skylake instability problem of two
years ago that took half a year and several kernel revisions to thrash
out. (https://mjg59.dreamwidth.org/41713.html) Then, a year later, I
buy that system off you, used, at half price. Deal? ;->
Deirdre Saoirse Moen
2018-11-19 20:33:03 UTC
Permalink
In contrast to what Rick says, a lot of Black Friday specials are older models. The catch is that many of them may be underpowered for Linux.

So, the drivers may be out there already; just make sure you're not hosing yourself on the RAM and/or drive space… in addition to the graphics driver things. On the NVIDIA question, I'm personally not a fan, but you do you.

Deirdre
Rick Moen
2018-11-19 21:10:14 UTC
Permalink
Post by Deirdre Saoirse Moen
In contrast to what Rick says, a lot of Black Friday specials are
older models. The catch is that many of them may be underpowered for
Linux.
Deirdre had this information about the models typically offered on Black
Friday (of which I was unaware), so I asked if she'd kindly post it.
(Thanks!)

The key to figuring out if the system is going to be anaemic is
_research_. This is the one fatal error I find that people purchasing
new systems tend to make: buying something on the spot and taking it
home, rather than writing down the make/model and any applicable options
bundles of promising offerings, going home, and spending some time
looking up their particulars (from a Linux perspective) on the Internet.
One heuristic that's useful is to make a list of the constituent
chipsets (ethernet, sound, wireless, video, etc.) and look up each such
chip model + the word 'Linux' with a Web search engine.
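If you can get a Linux live USB booted on the candidate machine (or on
your current one, for practice), the chipset list practically writes
itself. A rough sketch -- pciutils and usbutils supply these, and both
ship on most live images:

    lspci -nn   # every PCI device, with [vendor:device] IDs to feed the search engine
    lsusb       # USB-attached bits such as webcams and some wireless cards
    lspci -k    # shows which kernel driver, if any, claimed each device

A device with no 'Kernel driver in use:' line is the one to research
before handing over money.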

The result may end up being the same purchase, but a forewarned one,
e.g., you know you can install your favourite distro except you'll want
it on wired ethernet for initial Internet access in order to fetch
post-installation the required firmware BLOB file for the damned
Broadcom wireless chip.

In general terms, RAM is the most vital thing. Lots of it, not of
inexcusably slow varieties, and preferably with the ability to expand to
a decently high amount. It's 2018, so 16GB of RAM probably should be
routine (disclaimer: I haven't looked, and my knowledge of the market is
out of date). If not, wonder why.

Again, I haven't checked the state of the market, but I'll guesstimate
that although DDR4 RAM is enticing, it's still breathtakingly expensive
compared to DDR3 -- leading to a futureproofing dilemma. (Amirite?)

All of the major manufacturers have product _series_ (lines) aimed at
the 'home' market, that characteristically are shoddy and limited, and
that in general I consider to be sucker bait. For example, HP has
Pavilion, Dell has Dimension, Inspiron, and XPS[1], Acer has Aspire,
Toshiba has Satellite, and Lenovo has IdeaPad. I would recommend
entirely avoiding those, in favour of 'business' product lines, as a
ground-floor filter against cheap schlock that will be somewhat
inadequate from the get-go. 'Business' product lines tend to cost a bit
more. You get better value over time because their economic service
life is substantially longer, plus they Suck Less[tm] during those
service lives.

Don't forget that a unit with one annoying driver problem might
nonetheless be fine _if_ it's modular in that area. The classic example
would be a laptop that comes bundled with a Linux-hostile wireless chip,
with the saving grace that the wireless functionality is on a mini-PCI
card that's easily removed and replaced with an equivalent cheap
aftermarket mini-PCI subboard using an Intel wireless chip.


[1] XPS are supposed to be higher-performance, thus the
fashionable-in-2000s-marketing 'X' thing (standing for Xtreme
Performance System), but they were still pretty shoddy, IMO, and also
Nvidia-heavy.
Tony Godshall
2018-11-20 00:04:07 UTC
Permalink
Post by Rick Moen
The key to figuring out if the system is going to be anaemic is
_research_. This is the one fatal error I find that people purchasing
new systems tend to make: buying something on the spot and taking it
home, rather than writing down the make/model and any applicable options
bundles of promising offerings, going home, and spending some time
looking up their particulars (from a Linux perspective) on the Internet.
One heuristic that's useful is to make a list of the constituent
chipsets (ethernet, sound, wireless, video, etc.) and look up each such
chip model + the word 'Linux' with a Web search engine.
...

Alternatively, if a place has a good return policy, you can buy it,
try it with a LiveUSB/LiveCD version of your distro, and return
it if it doesn't easily work. If they ask for a reason, you can even put
a bug in whatever ear; enough people saying "doesn't work with my
favorite operating system" may eventually get vendors to improve
compatibility (but I wouldn't hold my breath).
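Writing the live image is the only fiddly part. A rough sketch -- the
ISO name and /dev/sdX are placeholders, and dd will cheerfully overwrite
the wrong disk, so triple-check the device name:

    lsblk    # identify the USB stick, e.g. /dev/sdX
    sudo dd if=distro.iso of=/dev/sdX bs=4M status=progress conv=fsync

Most distros also document a point-and-click way to do the same thing if
dd makes you nervous.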

But definitely, if you can afford it, support the vendors who sell
something you like *with Linux supported* and avoid paying for
a Windows license, directly or indirectly.

Tony
p***@ieee.org
2018-11-20 02:52:03 UTC
Permalink
When I had a Costco membership, they had a really good no hassle return policy. 

BTW,  I identified the drives without using a screwdriver.   After a couple of tries I found the key to access the BIOS.  Now I have the exact models for 2 IDE drives and 1 SATA.  I'm still not sure which is unhappy.

I'm not too keen on buying a "new" IDE drive.  Without opening the box, can someone tell me if I could add a 2nd SATA drive to the existing box?  Somehow I don't think SATA can be daisy-chained like 2 IDE drives.

Regarding computer power, it sure has changed.  My first machine was an S-100 with a Z-80 processor, which I over-clocked to an amazing 5 MHz.  I think the first memory card might have been 32K.  Also 8" floppy disk drives.  It took about 2 minutes to boot CP/M, launch WordStar, and open a small text file.  Over time, processors went from 8-bit to 16-bit and finally 64-bit.  Clock speeds went from a few MHz to 100 MHz, ... but topped out around 3 GHz.  No significant increase in quite a few years.  More CPU power comes from more cores, but for the applications I run, I rarely see even 4 cores fully used.

More RAM is still a good thing because we keep using higher resolution pictures.

Also, the folks at Seagate, SanDisk and Western Digital keep improving storage.   In the past, I might have taken 1000 3 MB pictures in a year.  Now they are 16 MB pictures, and GoPro makes it easy to take 10000 images in a few hours.
Rick Moen
2018-11-20 05:56:48 UTC
Permalink
Post by p***@ieee.org
When I had a Costco membership, they had a really good no hassle
return policy. 
And still do, by the way.
Post by p***@ieee.org
BTW,  I identified the drives without using a screwdriver.   After a
couple of tries I found the key to access BIOS.  Now I have the exact
models for 2 IDE drives and 1 SATA.  Still am not sure which is
unhappy.
I'm not too keen on buy a "new" IDE drive.  Without opening the box,
can someone tell me if I could add a 2nd SATA to the existing box. 
Somehow I don't think SATA can be daisy-chained like 2 IDE drives.
Yes to the former. No to the latter.

SATA cannot be daisy-chained. OTOH, the good news for you is that it's
extremely likely there are two SATA headers on the motherboard. (I'm
excluding as unlikely the possibility of SATA services being provided by
an add-in card such as a PCI one, but if that _is_ the case, then again
it's extremely likely that the card provides two SATA connectors.)

As I said, SATA is blessedly simple compared to PATA (old 'IDE'[1]).
Post by p***@ieee.org
More RAM is still a good thing because we keep using higher resolution pictures.
There are other good and IMO more compelling reasons why more RAM is
still a good thing.

One reason is that the OS can always deploy additional RAM in ways that
improve perceived performance. In the case of Linux, any RAM not spoken
for in any other way goes to a disc cache, which greatly improves the
perceived speed of mass storage, and also reduces wear on mass-storage
devices.
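You can watch that happening on any running system; a minimal sketch:

    free -h    # the 'buff/cache' column is RAM the kernel is using as disc cache

That cache shrinks automatically the moment applications want the memory
back, so 'unused' RAM on Linux is never really idle.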

Another reason is that, with an adequate amount of spare RAM, you can
use virtual-machine technology in one or more creative way that makes a
qualitative difference in your ability to use your system well. (I have
long held that VM technology ought to be used in preference to
dual-booting, for example, if there is adequate total RAM, except for a
few rare use-cases where running an OS in a VM layer causes problems
that prohibit fruitful use.)


[1] Technically speaking, SATA and PATA are both implementations of IDE.
Hence referring to the older standard as 'IDE' as if SATA were not also
IDE is incorrect and potentially confusing.
Ivan Sergio Borgonovo
2018-11-20 10:37:30 UTC
Permalink
Post by Rick Moen
One: If you honestly want no hardware-support problems, avoid buying
anything with a chipet that hasn't been on the market at least a year.
The simplest way to ensure that is to buy a _system_ model that's been
offered for sale for at least a year. (Even then, there are occasional
gotchas where an OEM has slipstream-changed the constituent parts of a
PC system without bothering to change the model number. Dell is
particularly infamous for doing this.)
I tend to use computers till some of their parts start to be unreliable.
While it seems we still have support for Trident video cards, with newer
video boards I'm starting to experience dropped support from kernel/vendor/X.
Here 2 out of 3 desktop PCs are having problems with newer kernel/X since
video board support has somehow been dropped or suffered regressions.

A few years ago I moved from proprietary ATI drivers to open ones, and now
on one PC I had to stop upgrading the kernel and on another one I had to
put the X video drivers on hold.

At least here in Italy used PCs that come with a warranty are expensive
and used PCs sold privately are way too old.

Even over a 6-year lifespan, 1 year of missing support taken from the
beginning or from the end is a non-negligible slice, plus you have to
consider installation, configuration and data-transfer time.

A really big slice of the economy now runs on Linux. No one would be so
crazy as to put "common hardware" on the market that can't run Linux, and
nowadays chipsets are highly coupled with the CPU (so coupled that I'm not
aware of any chipsets made by third parties other than the CPU maker).

There hasn't been that much innovation in other parts.

Notable exceptions are: wifi for notebooks, video boards, ethernets.

Branded PCs with newer parts reach the market only a few months later
than DIY parts.

If you're going to buy a used desktop/notebook there is not that much
choice on the performance side, and the cheap ones tend to be too old.

Servers tend to be supported from day one; you really have to be dumb to
put a server on the market that can't run Linux smoothly.

Of course newer software tends to be buggier.

I used to take computers phased out from the companies I worked for; the
cost couldn't get lower than zero, and I knew whether they were in
reasonably good condition to be worth a resurrection.

Unless you have friends willing to donate hardware that you're willing to
bet will last long enough to pay for the cost of the labour of setting it
up, I prefer to buy new *well chosen hardware*.

If you're not doing something "special" the cost of hardware doesn't
stack up against the cost of your time, unless you're having fun.
--
Ivan Sergio Borgonovo
https://www.webthatworks.it https://www.borgonovo.net
Rick Moen
2018-11-20 11:19:04 UTC
Permalink
Post by Ivan Sergio Borgonovo
I tend to use computers till some of their parts start to be
unreliable.
Same here, though in many cases I've given them away before that.

Some of the difference in our perceptions may trace to just the fact
of Silicon Valley being atypical. At any given time, there's a great
deal of slightly used and very good hardware floating around (along with
a great deal that isn't good, of course). Machines two years old can,
if you poke around, be found in peak condition at high discount.

Speaking for myself, I also got really good at avoiding hardware (and
hardware components) that's likely to have either reliability or
software-support problems for a very long time, (mostly) unconsciously
implementing Moen's Law of Hardware.

http://linuxmafia.com/~rick/lexicon.html#moenslaw-hardware

Moen's Law of Hardware

After years of helping people with hapless computer-hardware woes,
especially trouble-prone categories such as Linux on laptops, exotic
peripheral interfaces, etc., it occurred to me to wonder why I never had
such problems. It was mainly because of instinctive avoidance of dodgy,
exotic, new, and/or badly designed components -- which happens to track
strongly with programmers' characteristic prejudices. There's a logic to
that, which may not be immediately apparent to many:

Drivers for hardware don't emerge like Athena from the head of Zeus:
Someone has to create them. Especially for open-source OSes such as
Linux, this involves a chipset being brought to market, for it to be out
long enough for coders to buy and start using it, and for them to (if
necessary, absent manufacturer cooperation) do the hard work of
reverse-engineering required to write and debug hardware support. Then,
the resulting code filters out to various OS distributions' next
releases, and thus eventually to users.

It follows that, if you blithely buy what's new and shiny, or so badly
designed or built that coders eschew it, or so exotic that coders tend
not to own it, it will probably have sucky software support, especially
in open source. (Proprietary drivers can be written under NDA, often
before the hardware's release, while manufacturer help is routinely
denied to the open source world.) Conversely, if you buy equipment
that's been out for a while, doesn't suffer the (e.g., Diamond
Multimedia) problem of chip-du-jour, is bog-standard and of good but not
exotically expensive quality, it will probably have stellar driver
quality, because coders who rely on that hardware will make sure of
that.

Thus, it's very common for slightly ageing but good-quality gear to
outperform and be more reliable than the latest gee-whiz equipment,
because of radically better software support — not to mention the price
advantage.

Ergo, in 1999, instead of buying a current-production laptop to run
Linux on, I bought, used, a Sony VAIO PCG-505TX, because I knew several
Linux kernel coders had been using those as primary machines.
Performance and stability have been exceptional.

More broadly, if you can identify the types of gear programmers would
favour — and avoid — you'll be ahead of the game. Coders would avoid
winmodems / winprinters, brand-new 3D video chipsets, cheesy and
unsupported SATA "fakeraid" chipsets, low-end scanners reached through
parallel ports ganged to ATAPI ganged to SCSI chipsets, cheap
multifunction scanner/printer/fax boxes, hopelessly proprietary USB aDSL
bridge cards, etc. They would favour parts of standard interface,
command-set, and chipset design and high enough quality that they might
be reused in multiple machines over a long service life.

That's a rather old lexicon-page entry, as witness the reference to
winmodems. However, I've found it to still voice general truth.
Post by Ivan Sergio Borgonovo
A really big slice of economy now run on Linux. No one would be so
crazy to put on the market "common hardware" that can't run Linux
and nowaday chipset are highly coupled with the CPU (so coupled I'm
not aware of any chipset made by 3rd parties other than the CPU
maker).
Well, I just mentioned the very recent Intel chipset that was
quite terrible for Linux (instability) for the better part of a year
after it was already the basis for wildly popular PC models. That's
really not that uncommon among brand-new motherboard/CPU chipsets, sadly.
Post by Ivan Sergio Borgonovo
Notable exceptions are: wifi for notebooks, video boards, ethernets.
Well, here's what happens quite a bit: Broadcom (say) introduces yet
another cheap ethernet chipset that is only a tiny bit different from
the prior one, and probably works with one of the existing Linux open
source drivers with little or no modification, except for one little
problem: It has a new PCI ID, which means that kernel
autoprobing will not know what driver to modprobe for it. So, all those
customers buying it as new hardware for Linux will be mystified at the
apparently unsupported ethernet hardware. _Very_ determined users may
read a technical analysis or figure out the problem and patch the PCI
IDs database to compensate, but otherwise users will need to await a new
packaged kernel incorporating the PCI IDs update.
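For the very determined, the check and the stopgap look roughly like
this (the 14e4:XXXX ID and the tg3 driver name are placeholders for
illustration, not a recipe for any particular card; the last step needs
root):

    lspci -nn | grep -i ethernet   # note the [vendor:device] ID, e.g. [14e4:XXXX]
    modinfo tg3 | grep -i 14e4     # does the existing driver already claim that ID?
    # if not, you can sometimes coax the driver into binding anyway:
    echo "14e4 XXXX" > /sys/bus/pci/drivers/tg3/new_id

If that works, the real fix is usually a one-line PCI IDs addition
upstream, which is why it eventually shows up in a later packaged kernel.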
p***@ieee.org
2018-11-20 16:42:43 UTC
Permalink
I also use a computer until it becomes a problem.  Meanwhile, I found two links regarding AMD and Linux.
Official AMD support for Linux:
https://www.amd.com/en/support/kb/release-notes/rn-prorad-lin-18-30
And on debian.org:
https://wiki.debian.org/AtiHowTo
There are many more pages.   So if I do go shopping, I would look for an AMD processor.
Rick Moen
2018-11-20 18:32:52 UTC
Permalink
Post by p***@ieee.org
I also use a computer until it becomes a problem.Meanwhile, I found
two links regarding AMD and Linux.
Official AMD support for Linux
https://www.amd.com/en/support/kb/release-notes/rn-prorad-lin-18-30
And on debian.org
https://wiki.debian.org/AtiHowTo
There are many more pages.   So if I do go shopping, I would look for
AMD processor.
Goodness, I didn't know you were really concerned about whether AMD
x86_64 CPUs have Linux compatibility problems. They never have, which
really isn't surprising, since they invented the 64-bit x86 instruction
set (first implemented in the AMD Opteron, in 2003).

AMD released the spec in 1999, so that OS kernels could engineer support
for it in forthcoming hardware, and Linux was the first OS to do so,
starting with the 2.4 version in 2001, before the CPUs even hit the
market. (Engineers doing the port used simulators.)
p***@ieee.org
2018-11-20 19:11:48 UTC
Permalink
These links have details on AMD graphics.   Generally an AMD processor implies AMD graphics, so I don't have to look further to see that it is not the nBrand graphics.
Rick Moen
2018-11-20 20:17:30 UTC
Permalink
Th[ose] links have details on AMD graphics.   Generally an AMD
processor implies AMD graphics, so I don't have to look further to
see that it is not the nBrand graphics.
Oh, sorry, I didn't follow your links.

Yes, what you say seems entirely fine. Of course, even AMD GPUs are not
necessarily well supported in current Linux distro software at the time of
market introduction -- though they might be fully supported in the
latest upstream kernel and X.org releases. It certainly helps that AMD
assists the Linux and open source community.

As a current (?) example, see the Arch Linux comments about AMD 'Southern
Islands' (Radeon HD 7000 Series GPUs) and AMD 'Sea Islands' (Radeon HD
8000 Series GPUs), here:
https://wiki.archlinux.org/index.php/AMDGPU#Installation

I don't know how current that is, and am a little surprised at Arch
Linux support for those video chips being so fiddly. Maybe the page is
outdated (rare for the ArchLinux wiki), but it was current as of some
time in recent memory. Those GPU series entered the market in 2012 and
2013 respectively, according to Wikipedia.
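A quick way to check any given install, for what it's worth (a rough
sketch; lspci comes from pciutils, glxinfo from the mesa-utils package):

    lspci -k | grep -i -A 3 'vga\|3d\|display'   # 'Kernel driver in use: amdgpu' vs. 'radeon'
    glxinfo | grep -i 'opengl renderer'          # which Mesa driver is actually rendering

That tells you in a few seconds whether a box is on amdgpu, the older
radeon driver, or some software fallback.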
Ivan Sergio Borgonovo
2018-11-21 10:44:08 UTC
Permalink
This links have details on AMD graphics.   Generally an AMD processor
implies AMD graphics, so I don't have to look further to see that it is
not the nBrand graphics.
It depends on whether you're choosing a CPU with an integrated GPU or not.

AMD integrated GPUs tend to outperform Intel integrated GPUs... in fact
AMD is selling its technology to help Intel integrate better GPUs in
their CPUs (yep).

There are only discrete Nvidia GPUs... in notebooks you may have an Intel
integrated GPU + an Nvidia GPU for power-management reasons... that has
been painful for Linux users.

If you're into "things that use GPUs" to do stuff... well... I bet you
know better than me. If you aren't, you shouldn't care.

If you're interested in high resolutions x many monitors you'd better go
for a discrete video board (not for performance reasons but rather for
having somewhere to put cables and memory).

Nvidia video boards can work with AMD CPUs.

I used to choose Nvidia for most of my workstations because they had the
best driver installer at that time and somehow they offered the best
bang for the buck.
Then AMD started to ship an equally good, probably better, installer and I
could find cheaper cards, so I switched.

The fact that AMD was more open-source friendly helped the development of
open drivers that extended the lifespan of my PCs when proprietary support
was dropped.

I don't need fast video cards. I know Nvidia has for a long time been
the best performer, and now it seems AMD has returned to being competitive.

Now considering my needs for a workstation I'd choose AMD for both CPU
and GPU... (cheaper, more cores, Spectre impact).
--
Ivan Sergio Borgonovo
https://www.webthatworks.it https://www.borgonovo.net
Ivan Sergio Borgonovo
2018-11-21 10:45:33 UTC
Permalink
Post by Rick Moen
Post by Ivan Sergio Borgonovo
I tend to use computers till some of their parts start to be
unreliable.
Same here, though in many cases I've given them away before that.
Some of the difference in our perceptions may trace to just the fact
of Silicon Valley being atypical. At any given time, there's a great
deal of slightly used and very good hardware floating around (along with
a great deal that isn't good, of course). Machines two years old can,
if you poke around, be found in peak condition at high discount.
Probably.

Here there are companies that sell used hardware with warranty.
Full boxes or spare parts.
Their target market is probably other companies.
They are still in business (I wonder why).
They are not cheap.

There are a few "retail" shops selling used hardware with a warranty.
They are not cheap. Very limited choice.

I could easily find full PCs for 50-150 Euros on websites "a la
Craigslist" without a warranty.
Considering my average budget for a new full PC excluding monitors is
around 400 Euros, the savings don't cover the risk (work + something not
working + reduced lifespan).

If I were looking for something more expensive I'd probably be looking
for something higher-performing, and probably new.

If it is not fun I do value my time very much.

Somehow it is such a pity... it's a lot of avoidable pollution... but
there is no reasonable market here, and somehow I can understand the
reason. You'd be a fool to give a warranty on PCs without testing them,
and testing them once they are out of the factory can be pretty expensive
if done right; in that price segment, considering the other costs you
incur in reselling a used PC, it's not worth it.

And... about pollution... one factor to take into account is the
difference in power consumption between old and new PCs... across some
"generations" there may not be an appreciable difference, but sometimes
there is a huge difference, and even on a "home workstation" it adds up
in the TCO (especially in the EU, where I think electricity is a bit more
expensive than in the US).
Post by Rick Moen
Post by Ivan Sergio Borgonovo
A really big slice of economy now run on Linux. No one would be so
crazy to put on the market "common hardware" that can't run Linux
and nowaday chipset are highly coupled with the CPU (so coupled I'm
not aware of any chipset made by 3rd parties other than the CPU
maker).
Well, I just mentioned about the very recent Intel chipset that was
quite terrible for Linux (instability) for the better part of a year
after it was already the basis for wildly popular PC models. That's
really not that uncommon among brand-new motherboard/CP chipsets, sadly.
Probably "consumer" hardware in one of the two categories you mentioned
in an earlier post: cheapo or exotic (look at how many features I can
choke on).
Post by Rick Moen
Post by Ivan Sergio Borgonovo
Notable exceptions are: wifi for notebooks, video boards, ethernets.
Well, here's what happens quite a bit: Broadcom (say) introduces yet
another cheap ethernet chipset that is only a tiny bit different from
the prior one, and probably works with one of the existing Linux open
source drivers with little or no modifiction, except for one little
problem: It has a new PCI ID identifier, which means that kernel
autoprobing will not know what driver to modprobe for it. So, all those
customers buying it as new hardware for Linux will be mystified at the
apparently unsupported ethernet hardware. _Very_ determined users may
read a technical analysis or figure out the problem and patch the PCI
IDs database to compensate, but otherwise users will need to await a new
packaged kernel incorporating the PCI IDs update.
Buy a new, reasonably balanced mid-range mobo, not cheapo, not exotic.
Add a 20-buck Intel network card

vs.

spend an hour diagnosing the problem

Wifi can be more tricky since high-performance wifi cards cost
definitely more than 20 bucks.
Furthermore a notebook is a mobile device... it has to be light etc...
if it is light it generally is hard to open and swap parts etc...


You have to reach compromises and be a bit careful, but it's not as bad
as it used to be.

I think, for example, Steam contributed a bit to having more reasonable
"consumer" video drivers in Linux as well... and there is so much stuff
in the embedded world using wifi and Linux...

I think a good starting point is always realizing what you really want.
--
Ivan Sergio Borgonovo
https://www.webthatworks.it https://www.borgonovo.net
p***@ieee.org
2018-11-21 17:46:04 UTC
Permalink
Ivan,
Thank you for all of the information.
Once there were 2 "major" makers of graphics chips:  ATI and nVidia.  AMD bought ATI in 2006 and has been building on that technology since then.  Other recent emails discussed that AMD tries to support Linux.
I have heard of others using laptops with both Intel and nVidia graphics and having to stop and switch between high-speed graphics and low power consumption.  I am not a gamer, and don't need that hassle.
Regarding vintage hardware, I found that NewEgg sells IDE drives with the note that the manufacturer no longer supports them, but NewEgg has a warranty.  That ought to be better than Craigslist.  Now I know my problem drive is SATA, so I don't have to go that route.

So I am considering whether I should replace the drive, or retire the whole machine in favor of a laptop that possibly has a new CPU, etc.  It's probably not a good use of my time and $ to replace the motherboard and RAM and drive(s).  So I should consider laptops this weekend.



Paul
Post by Rick Moen
Post by Ivan Sergio Borgonovo
I tend to use computers till some of their parts start to be
unreliable.
Same here, though in many cases I've given them away before that.
Some of the difference in our perceptions may trace to just the fact
of Silicon Valley being atypical.  At any given time, there's a great
deal of slightly used and very good hardware floating around (along with
a great deal that isn't good, of course).  Machines two years old can,
if you poke around, be found in peak condition at high discount.
Probably.

Here there are companies that sell used hardware with warranty.
Full boxes or spare parts.
Their target market is probably other companies.
They are still on business (wondering why).
They are not cheap.

There are few "retail" shops selling used hardware with warranty.
They are not cheap. Very limited choice.

I could easily find full PCs for 50-150 Euros on websites "a la
Craiglist" without warranty.
Considering my average budget for a new full PC excluding monitors is
around 400 Euros, the savings don't cover the risk (work + something not
working + reduced lifespan).

If I was looking for something more expensive I'd probably looking for
something more performing and probably new.

If it is not fun I do value my time very much.

Somehow it is such a pity... it's a lot of avoidable pollution... but
there is no reasonable market here and somehow I can understand the
reason. You'd be a fool giving warranty to PCs without testing them and
testing them once they are out of the factory can be pretty expensive if
done right and in that segment of prices considering other costs you
incur in reselling used PC not worth.

And... about pollution... one factor to take into account is difference
in power consumption between old and new PCs... somehow across
"generations" there may not be appreciable difference but sometimes
there is a huge difference and even on a "home workstation" it adds up
to the TCO (especially in EU where I think electricity is a bit more
expensive than in US).
Post by Rick Moen
Post by Ivan Sergio Borgonovo
A really big slice of economy now run on Linux. No one would be so
crazy to put on the market "common hardware" that can't run Linux
and nowaday chipset are highly coupled with the CPU (so coupled I'm
not aware of any chipset made by 3rd parties other than the CPU
maker).
Well, I just mentioned about the very recent Intel chipset that was
quite terrible for Linux (instability) for the better part of a year
after it was already the basis for wildly popular PC models.  That's
really not that uncommon among brand-new motherboard/CP chipsets, sadly.
Probably "consumer" hardware in one of the two categories you mentioned
in an earlier post: cheapo or exotic (look at how many features I can
choke on).
Post by Rick Moen
Post by Ivan Sergio Borgonovo
Notable exceptions are: wifi for notebooks, video boards, ethernets.
Well, here's what happens quite a bit:  Broadcom (say) introduces yet
another cheap ethernet chipset that is only a tiny bit different from
the prior one, and probably works with one of the existing Linux open
source drivers with little or no modifiction, except for one little
problem:  It has a new PCI ID identifier, which means that kernel
autoprobing will not know what driver to modprobe for it.  So, all those
customers buying it as new hardware for Linux will be mystified at the
apparently unsupported ethernet hardware.  _Very_ determined users may
read a technical analysis or figure out the problem and patch the PCI
IDs database to compensate, but otherwise users will need to await a new
packaged kernel incorporating the PCI IDs update.
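(For anyone who hits that, a minimal workaround sketch from a root shell -- it assumes the existing driver really does handle the new silicon, and both the driver name tg3 and the ID pair 14e4 16b4 below are placeholders you would substitute from your own lspci output.)

  # Find the unclaimed card's vendor:device ID, e.g. [14e4:16b4].
  lspci -nn | grep -i ethernet
  # Load the likely driver, then tell it to claim that ID via sysfs.
  modprobe tg3
  echo "14e4 16b4" > /sys/bus/pci/drivers/tg3/new_id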
Buy a new, reasonably balanced mid-range mobo, not cheapo, not exotic. Add
a 20-buck Intel network card

vs.

spend an hour diagnosing the problem

wifi can be more tricky, since high-performance wifi cards cost
definitely more than 20 bucks.
Furthermore a notebook is a mobile device... it has to be light etc...
if it is light, it is generally hard to open and swap parts etc...


You have to reach compromises and be a bit careful, but it's not as bad as
it used to be.

I think, for example, Steam contributed a bit to having more reasonable
"consumer" video drivers on Linux as well... and there is so much stuff
in the embedded world using wifi and Linux...

I think a good starting point is always realizing what you really want.
--
Ivan Sergio Borgonovo
https://www.webthatworks.it https://www.borgonovo.net
Tony Godshall
2018-11-21 18:33:30 UTC
Permalink
Regarding vintage hardware, I found that NewEgg sells IDE drives with the note that the manufacturer no longer supports them, but NewEgg has a warranty. That ought to be better than Craigslist. Now I know my problem is a SATA drive, so I don't have to go that route.
I would go with modern SSD and an IDE adapter (msata-ide) over buying
actual legacy IDE drives. The price per megabyte has fallen
dramatically, and you will like the performance improvement. Just
make sure you get SSDs that are small enough for your legacy host's
BIOS and OS to handle. Luckily they are readily available (spinning
media in those sizes, largely).

And backups, backups, backups.
Rick Moen
2018-11-21 20:47:20 UTC
Permalink
Post by Tony Godshall
I would go with modern SSD and an IDE adapter (msata-ide) over buying
actual legacy IDE drives.
There was no real question of Paul being in the market for a new PATA
('IDE') drive, even if the (allegedly) failing drive had been PATA,
which it wasn't. If that _had_ been the case, and he wanted to replace
the (allegedly) failing PATA drive, the logical replacement would have
been a second, large, modern SATA drive on the motherboard's second
SATA connector, not anything new including an mSATA-to-PATA adapter on
either of the two PATA connectors. Because KISS.

In addition, the guess that the drive is failing does not IMO seem
well-supported. The obvious thing Paul should try is the Seagate
Tools's so-called 'zero fill erase option', which I mentioned upthread.
This is, I gather, Seagate's current implementation for IDE of the
traditional low-level format routine.

I mention this because one should always attempt to recondition
allegedly failing hard drives with low-level formatting before giving up
on them, unless there is some separate reason for getting rid of them,
like they're ridiculously small and/or slow for the current year's
needs, or they keep emitting smoke, or they are now cruising at a
constant zero RPM.
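(The vendor-neutral equivalent from a Linux live environment is sketched below.  Both commands are destructive -- they assume the suspect drive really is /dev/sdc and that everything on it is already backed up or expendable.)

  # Overwrite the whole drive with zeros, which gives the firmware a
  # chance to remap any weak sectors it trips over along the way.
  dd if=/dev/zero of=/dev/sdc bs=1M status=progress
  # Or a four-pattern destructive write-and-verify surface test.
  badblocks -wsv /dev/sdc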

But, as you say, I would definitely consider SSDs first if pondering a
drive replacement in preference to a new hard drive. The benefits are
ludicrously large.
p***@ieee.org
2018-11-22 01:05:54 UTC
Permalink
First,  I am thankful for all of the constructive comments. 

I did a little "window shopping" for components.  If I open my box and put in a new motherboard, new CPU, RAM, and storage, I can expect to spend $500-600.  That will get me more cores and more storage than I had before.

One area that really affects the price is storage. 
2T HDD  $50
4T HDD  $75
2T SSD  $250-350
I've seen the benefit of cache on a HDD, but I need some more reason for the delta dollars.
Now if instead I get a laptop, I get a new display, keyboard, power supply and case which I don't really need.
Now regarding the "failing drive":
Booting Debian has become problematic.  / is on the "bad drive".
On two different computers, the Seagate tool says "No device mounted".  On both computers the WD Data Lifeguard recognizes all of the devices.  It even found an SSD on a USB port.
When I am 110% sure that I have absolutely all of the important data files safely copied, I will try more tests, including writing 0's.
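(One way to get to that 110%, sketched from the Linux side: a quick byte-for-byte comparison plus a checksum manifest.  The paths below -- /home/paul/important and a backup drive mounted at /mnt/backup -- are made up for illustration.)

  # Compare source and copy recursively; prints nothing if they match.
  diff -r /home/paul/important /mnt/backup/important
  # Record checksums alongside the copy so it can be re-verified later.
  ( cd /home/paul/important && find . -type f -exec sha256sum {} + ) > /mnt/backup/important.sha256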
Post by Tony Godshall
I would go with modern SSD and an IDE adapter (msata-ide) over buying
actual legacy IDE drives.
There was no real question of Paul being in the market for a new PATA
('IDE') drive, even if the (allegedly) failing drive had been PATA,
which it wasn't.  If that _had_ been the case, and he wanted to replace
the (allegedly) failing PATA drive, the logical replacement would have
been a second, large, modern SATA drive on the motherboard's second
SATA connector, not anything new including an mSATA-to-PATA adapter on
either of the two PATA connectors.  Because KISS.

In addition, the guess that the drive is failing does not IMO seem
well-supported.  The obvious thing Paul should try is the Seagate
Tools's so-called 'zero fill erase option', which I mentioned upthread.
This is, I gather, Seagate's current implementation for IDE of the
traditional low-level format routine.

I mention this because one should always attempt to recondition
allegedly failing hard drives with low-level formatting before giving up
on them, unless there is some separate reason for getting rid of them,
like they're ridiculously small and/or slow for the current year's
needs, or they keep emitting smoke, or they are now cruising at a
constant zero RPM.

But, as you say, I would definitely consider SSDs first if pondering a
drive replacement in preference to a new hard drive.  The benefits are
ludicrously large.
Ivan Sergio Borgonovo
2018-11-22 07:18:04 UTC
Permalink
First,  I am thankful for all of the constructive comments.
I did a little "window shopping" for components.  If I open my box and
put in new motherboard, new CPU, RAM and storage, I can expect to spend
$500 - 600.   That will get me more cores and more storage than I had
before.
That seems a pretty expensive box, considering you were using a very old
one, which probably means you didn't need a very fast PC.

Furthermore, a case for a cheap box is a very small percentage of the
cost of the whole PC. With a "slow" video board and "slow" CPU you don't
need a beefy power supply, nor any special investment in cooling.

A new power supply is not a really bad idea if the old one is OLD.
It is going to be more reliable and probably more efficient.
One area that really affects the price is storage.
2T HDD  $50
4T HDD  $75
2T SSD  $250-350
I've seen the benefit of cache on a HDD, but I need some more reason for the delta dollars.
Do you do regular backups somewhere else? Even better, do you have a
NAS/server at home?
Get a single fast smaller SSD.
An SSD will make your PC fly.
Now if instead I get a laptop, I get a new display, keyboard, power
supply and case which I don't really need.
And a new UPS. A new display is cool for dual monitors, unless you
already have 2 monitors, and many laptops can't handle more than one
additional monitor.

And you skip the work of assembling the PC, and you'll generally have
better warranty coverage.
--
Ivan Sergio Borgonovo
https://www.webthatworks.it https://www.borgonovo.net
Rick Moen
2018-11-22 09:42:44 UTC
Permalink
Quoting Ivan Sergio Borgonovo (***@webthatworks.it):

[snip a lot]
Post by Ivan Sergio Borgonovo
And a new UPS.
Rick's rant #327:

Why not a voltage regulator?

Buying a UPS puts about 70% of your purchase money into a huge lead-acid
battery that you'll need to replace about every five-six years, and
all it gets you is the ability to bridge continuous operation across
relatively small power outages. Plus, with the right integration
software for your OS, you can get orderly shutdowns when the battery is
about to run out of power. Plus, there's a bit of voltage regulation
circuitry, but typically not as good as the circuitry in a dedicated
voltage regulator box.

So, wow, the battery's there because... we don't yet have journaled
filesystems?  I don't know about you, but for the past 15+ years, my
Linux systems have all had journaling, which means that when the power
goes down, sure, the system goes down, but when power's restored the
system comes back up undamaged.
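(A quick way to confirm that on a given box, assuming ext3/ext4 filesystems and that /dev/sda1 is the root partition -- both assumptions, adjust to taste:)

  # 'has_journal' in the feature list means the filesystem journals metadata.
  tune2fs -l /dev/sda1 | grep -i features
  # Or just list mounted ext3/ext4 filesystems at a glance.
  findmnt -t ext3,ext4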

Given that my systems have journaling, the threat model of concern isn't
so much blackouts as brownouts -- periods of low voltage and voltage
swings.

Which is what an outboard voltage regulator protects against.

People keep telling me I ought to have a big honkin' lead-acid battery,
i.e., a UPS, but they never seem to be able to tell me _why_ I should
want that.  The cynic in me suspects this is because they really haven't
thought things through, and haven't considered that perhaps voltage
regulation without the big honkin' lead-acid battery might be what's
actually wanted.
p***@ieee.org
2018-11-22 17:43:43 UTC
Permalink
Thank you again for all of the helpful comments.  I'm going to address different topics separately.

Over the years, I have had a few situations when power was shut down or lost.  Linux systems always restarted.  Windows would complain at start up.  Typically the biggest issue was with OpenOffice / LibreOffice.  In my experience it recovered any files that were open, but maybe a few minutes of the last changes were lost.
Some years ago, I bought an AC voltage stabilizer.  Totally passive, based on a special resonant transformer.  It outputs a clean AC waveform despite variations of the input voltage or waveform.  It's as big as a desktop computer and very heavy.  For that reason, I never put it by my desk.  8 or 10 years back PG&E had an incident that sent a high voltage spike followed by a blackout for half an hour.  Eventually they contributed to the replacement computer.  Come to think of it, one of my IDE drives was a survivor of that incident.

Rick Moen
2018-11-21 20:33:45 UTC
Permalink
Post by p***@ieee.org
Ivan,
Thank you for all of the information.
Once there were 2 "major" makers of graphics chips:  ATI and nVidia.
AMD bought ATI in 2006 and has been building on that technology since
then.  Other recent emails discussed that AMD tries to support Linux.
I'm pretty sure Ivan knows all about this (just sayin'). ;->
Post by p***@ieee.org
So I am considering if I should replace the drive, or retire the
whole machine in favor of a laptop that possibly has new CPU, etc.
If you have the dosh to lavish, migrating from an old workstation to a
new laptop has many advantages. I'll just reiterate that all the
pitfalls of new chipsets' Linux support _can_ apply, and in fact laptops
are rather notorious for them, particularly but not exclusively in the
area of wireless chip support.
Ivan Sergio Borgonovo
2018-11-22 00:07:11 UTC
Permalink
Post by Rick Moen
Post by p***@ieee.org
Ivan,
Thank you for all of the information.
Once there were 2 "major" makers of graphics chips:  ATI and nVidia.
AMD bought ATI in 2006 and has been building on that technology since
then.  Other recent emails discussed that AMD tries to support Linux.
I'm pretty sure Ivan knows all about this (just sayin'). ;->
uh let's add Trident, Sis, Matrox (nice boards), Genoa, Paradise, Chips
and Technologies, Tseng, Cirrus, Via, S3, 3dfx, Oak, Number nine (nice
tech), Hercules (I can't believe I had one, high res, no colors)...

I didn't even have to check my "Programming guide to the EGA and VGA
Cards" to name them, but I'm sure I'm missing some.
--
Ivan Sergio Borgonovo
https://www.webthatworks.it https://www.borgonovo.net
Rick Moen
2018-11-22 01:04:45 UTC
Permalink
Post by Ivan Sergio Borgonovo
uh let's add Trident, Sis, Matrox (nice boards), Genoa, Paradise,
Chips and Technologies, Tseng, Cirrus, Via, S3, 3dfx, Oak, Number
nine (nice tech), Hercules (I can't believe I had one, high res, no
colors)...
Hey, I _loved_ my Hercules graphics card. And the colour was fine:
your choice of crisp green or amber against glorious black.

(And Matrox was simply ideal for its time.)

Flight Simulator in amber against black was really weird, though -- or,
worse, with one of IBM's green monochrome monitors on account of their
long pixel hang times that made game-playing imagery a bit
hallucinogenic.
Christian Einfeldt
2018-11-21 18:23:16 UTC
Permalink
There is always Zareason, of course, a local computer manufacturer which
only installs Linux on machines:

http://zareason.com/
Post by p***@ieee.org
The discussion about disk drives caused me to think, Why don't I just
replace the whole thing, especially with Black Friday approaching. Which
brings up some considerations that would be of importance to Linux install.
Graphics: I recall fussing in the past with non-free drivers. Do I
want to avoid nVidia?
Intel vs. AMD. I happen to have machines with AMD.
number of cores/threads. Despite many years of talk, is real parallel
processing real? I'm not into gaming.
--
Christian Einfeldt
p***@ieee.org
2018-11-21 07:04:03 UTC
Permalink
I have identified the problem.
First, I used the BIOS setup to find the models of the disk drives.  

IDE    Seagate  120 GB
IDE    WD       200 GB
SATA1  Seagate  1000 GB
SATA2
SATA3  DVD drive.

Next, following Rick's suggestion, I went to the Seagate and WD websites.  Each had tools for Windows, Mac, and standalone use.  One company's standalone tool said it only worked if the disk was formatted FAT32.

This time, dual boot was helpful.  I booted Win7 and downloaded SeaTools and Western Digital Data Lifeguard.  The Windows install was uneventful.  I launched SeaTools; nothing happened.  I launched the WD tool.  After just a few seconds there was a table of results.  It listed all of the drives, even a USB dongle.  The old IDE drives pass.  The SATA 1 TB Seagate failed.

Reallocated SectorCount Value 3, Threshold 36, Worst 3

From the many other entries I think possibly this means there are only 3 spare sectors left to re-allocate.
In /var/log/messages:
* `grep -v scsi` returns nothing.
messages.1:Nov 11 22:26:15 PZ01 org.gtk.vfs.UDisks2VolumeMonitor[1283]: disc.c:352: error opening file BDMV/index.bdmv
messages.1:Nov 11 22:26:15 PZ01 org.gtk.vfs.UDisks2VolumeMonitor[1283]: disc.c:352: error opening file BDMV/BACKUP/index.bdmv
Well, that _could_ be a sign of a failing drive, but my offhand
impression is that's really thin in the context of the BIOS raising an
alarm during POST.  It doesn't seem to match.
Devices are indeed listed as /dev/sda, /dev/sdb, and /dev/sdc.
Sizes are roughly 100 GB, 200 GB, and 900 GB, which gives some clue as to how old they are.
I'm thinking that once I get the box open, it might be time to buy a new TB-class drive.  I had been thinking it was time to reinstall the OS anyway.  If /usr/bin is on a drive with an SSD buffer, everything will run much faster.
If during diagnosis you isolate the problem cause to a specific physical
drive device, don't hasten to throw it away without some further steps.
Specifically, each manufacturer (Samsung, Western Digital, Seagate,
etc.) offers for download diagnostic and repair software that in many
cases can massage an old drive into like-new condition.  Look around and
find (and use) the one for your device.  Example:
https://www.seagate.com/support/downloads/seatools/
Rick Moen
2018-11-21 07:41:56 UTC
Permalink
Post by p***@ieee.org
Reallocated SectorCount Value 3, Threshold 36, Worst 3
From the many other entries I think possibly this means there are only
3 spare sectors left to re-allocate.
No. This is related to some hocus-pocus going on in the drive's
on-board electronics to swap in spare sectors. Quoting a small piece of
https://superuser.com/questions/384095/how-to-force-a-remap-of-sectors-reported-in-s-m-a-r-t-c5-current-pending-sector
:

Most modern drives contain a number of "spare" sectors (e.g. 1,024
spare sectors). If the drive recognizes a sector as bad, it will stop
using it. Any requests to read or write to that damaged sector will
transparently be redirected to a spare sector. This marking off of a bad
sector, and reallocating its data to a spare sector, is called a
Reallocation Event. And the total number of sectors that have been
reallocated (and so how many of your spare sectors have been used up) is
the Reallocated Sector Count.


But, that aside, you say the drive failed [something]. OK, and maybe
the remapping of those three failing sectors to spare sectors, behind
the scenes, is part of a slow loss of sectors that's ongoing and will
eventually exhaust all spare sectors, after which you'd start actually
losing data living on failing sectors. You didn't say what 'failed'
meant, nor -- the important number -- how many spare sectors remain.
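(Incidentally, the same counters are readable from Linux with smartmontools, if that's more convenient than the vendor tools.  This assumes the smartctl package is installed and that the suspect Seagate really is /dev/sdc, which is only a guess here.)

  # Dump the SMART attribute table; the RAW_VALUE column of ID 5
  # (Reallocated_Sector_Ct) is how many sectors have been remapped so far.
  smartctl -A /dev/sdc
  # The drive's own overall pass/fail self-assessment.
  smartctl -H /dev/sdc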

Just as a guess, not having the data you (apparently) saw, I suspect
what you want to do is this:
http://knowledge.seagate.com/articles/en_US/FAQ/203931en
That might fix all current problems. (If not, the device can always
achieve its best and highest purpose as landfill.)
p***@ieee.org
2018-11-21 20:21:13 UTC
Permalink
In using the computer, I was not aware of any malfunctions.  What I saw were:
* The message from the BIOS running POST
* When running Win7, it started popping up a window saying a drive was failing and that I should run a backup.
The Seagate Tool did not run on the "bad" computer with Win7.   On a newer machine with Win10, it identified and tested a Toshiba drive.
I went back to the Seagate website, but could not find a different (older) version.  In the interest of science, I will try the USB version.

Going forward:  Win 7 kept nagging me to back up.  Well, I have copied all of MY data files off the system.  Now suppose I did a Windows backup, what would it get me?
- If I replace the hard drive, can I use the backup to install Windows 7?
- What if I also replace the motherboard?  Pretty sure that won't work.
- If I replace the hard drive and do a clean install of Debian, can I use the backup to run under a virtual drive?
- In short, is there anything useful I can do with the backup?
BTW, I do have the install CD, but not sure if I can re-use the serial number with new hardware.
Rick Moen
2018-11-22 03:32:41 UTC
Permalink
In using the computer, I was not aware of any malfunctions. 
With the exception (you mention elsethread) that booting Debian has
'become problematic', because its root partition is on the affected hard
drive.
What I saw were:* The message from the BIOS running POST* When running
Win7, it started popping up a window saying a drive was failing and
that I should run a backup.
Legitimate cause for concern, and you absolutely did the right thing by
setting backup as your first priority.
The Seagate Tool did not run on the "bad" computer with Win7.
Um,... OK? I'm not totally sure -- actually, a bit mystified -- what
'did not run' means. I see that there's a SeaTools for Windows whose
compatibility is supposed to include MS-Windows 7, SeaTools for DOS, an
intriguing 'SeaTools Bootable' that is self-hosting and you copy somehow
to a USB flash drive prior to use, and some 'SeaTools Legacy Tools'
files, comprising v1.12 that is recommended for use if 'you have system
compatibility problems with the v2 GUI version', apparently running in
good ol' text mode, and 'v.2.23 (graphical)'.

I gather that current (non-'legacy') releases of SeaTools appear to
include CLI and graphical versions of the tools. You should carefully
read program information and documentation, including any information
the drive manufacturer may (or may not) publish about what versions of
their diagnostic software are required for specific hard drive models
they have released in the past.
On a newer machine with Win10, it identified and tested a Toshiba drive.
Um, huh?

It would be a bad idea on general principle to have a Seagate utility
play around with a Toshiba drive. I would not have done that. If it
offered to test a Toshiba drive (let alone repair it), I would have
acted to prevent that operation. IMO, you should have wanted to find,
for a Seagate drive deemed suspect, the correct Seagate utility to
diagnose and possibly fix it.
I went back to the Seagate website, but could not find a different (older) version. 
Um... I don't know what 'older' means, here. Older than what? Why?

I'm really not clear on what you're doing and why, but FWIW the direct
link for 'SeaTools Legacy Tools' is
https://www.seagate.com/support/downloads/seatools/seatools-legacy-support-master/
--- but I'm confused by the notion that you couldn't find that (if
that's what you were trying to find).

I'm concerned that I'm unclear on exactly what you're trying to do,
here, and why. Perhaps you should back up and, first of all, determine
the exact model of the suspect Seagate drive. (Among other places, that
will almost certainly be shown in your BIOS Setup screens, in addition
to briefly during each Power-On Self Test.)

Try entering that model number plus, oh, I don't know, maybe 'diagnostics'
into a Web search engine -- with the aim of attempting to find out from
the Internet what Seagate-offered diagnostic software is (or was)
available for it.

I don't want to rub salt into your wounds, but around the time you first
acquired that hardware would have been an _excellent_ time to download
and burn to a CD/DVD all of the then-available diagnostic and repair
software for all the parts in your PC, most particularly the hard
drives. As time passes, one might well expect the manufacturers of the
PC's various subassemblies to _cease_ offering software related to old
and EOLed product.
In the interest of science, I will try the USB version.
If you mean the self-hosted 'SeaTools Bootable' tool, that sounds like a
fine idea.
Going forward:  Win 7 kept nagging me to back up.  Well I have copied
all of MY data files off the system.  Now suppose I did a windows
backup, what would it get me?
- If I replace the harddrive, can I use the backup to install windows 7?
- What if I also replace the motherboard?  Pretty sure that won't work.
- So I replace hard drive and do a clean install of Debian, can I
use the back up to run under a virtual drive?
- In short, is there anything useful I can do with the backup?
Well, this is an awkward bit, because I need to start with a couple of
disclaimers:

1. I have no firm grasp of what you are specifically referring to
when you say 'a Windows backup'. This is no doubt in part because

2. I'm pretty certain I've never even seen MS-Windows 7. So,
it follows fairly naturally that I have no idea what the
characteristics, capabilities, and limitations are of the bundled backup
utility. I actually didn't know they currently bundled one.
Historically, the bundled backup utilities with NT-based releases of
MS-Windows have been so dismal as to be pretty much universally ignored,
if memory serves.

However, I suspect you're asking the wrong question. I believe the
question you meant to ask is 'If I replace the believed-to-be-failing
Seagate hard drive, how do I carry forward to any replacement drive
MS-Windows 7, which is installed onto the believed-to-be-failing Seagate
drive?'
BTW, I do have the install CD, but not sure if I can re-use the serial
number with new hardware.
Yeah, about that:

You're not going to want to hear this (but -- disclaimer -- I lack
modern expertise on this matter because I've carefully opted out of
MS-Windows for decades). tl;dr: Microsoft Corporation has screwed you
and countless other users of its OS as to ability to reinstall the
operating system you paid for.


As you didn't clarify, I'm going to hazard a guess that your situation
is that of a typical modern MS-Windows user:

1. You did not purchase a retail copy of MS-Windows 7. All you have is
an 'OEM preload' installation that arrived already installed on your
PC's hard drive, ready to use after booting the PC and entering your
name and some other details. The cost of the OEM preload was bundled
into the system price of the PC.

2. You do not possess a general-purpose installation image useful to
reinstall MS-Windows 7 from scratch, because none was provided to you,
and you probably didn't even stop to realise that this would be a big
long-term problem. Instead, you have one of two inadequate substitutes.
(a) You might have a 'recovery' DVD/CD. But probably not, because
burning one would have cost the PC manufacturer a whole 50 cents.
Instead, (b) you probably have a 'recovery partition' on the same hard
drive where the runtime preload lives.

If memory serves, there is a fairly easy way to create an optical disc
image housing the contents of the 'recovery partition', so that you have
at least that half-measure reinstallation software on something that
won't go 'Pfft!' if the hard drive fails. Of course, the time and
trouble and (tiny) expense of burning the DVD is on you, and only an
insignificant percentage of MS-Windows users think to do so -- before
the hard drive fails or becomes unreliable, at which point, oops, it's
too late.

What is a 'recovery' image or partition? This is a deliberately
reduced-functionality MS-Windows installer that offers no installation
options, e.g., doesn't permit you to state how to partition the target
hard drive and which existing partitions to leave alone. Instead, IIRC,
it blows away 100% of the existing contents and all filesystems on the
target drive and reconstructs the hard drive contents exactly the way
the hard drive was partitioned and loaded by the OEM. This means the
'recovery' installer will blow away alternative-OS contents on that
drive, and will reinstall MS-Windows 7 plus all of the 'ratware'
third-party junk and advertising that the OEM accepted money to throw
into the bundle.


More of what you really didn't want to hear: When your PC was
brand-new, it was strongly in your interest to stop and think: 'Where's
my off-system means of reinstalling the operating system and (any)
bundled software? I.e., where's my master installation copy of the
software I'm paying for?' Any time you pay for proprietary software,
you're supposed to receive a reinstallable master copy, either on an
optical disc (sometimes with an activation code), or possibly as a
downloadable installable set of files (sometimes with an activation
code). This being something you paid to acquire, and knowing that
accidents happen to computers (including but not limited to failing hard
drives and malware), you make sure you have the complete means to
reinstall that software from scratch tucked away somewhere off-system,
right? Like on a DVD scrawled with the activation code (if any) in
Sharpie on the front?

So, where's your offsystem installable copy of MS-Windows 7, the OS
you paid for as part of your once-new PC? What is your plan of action
when, not if, the hard drive it's OEM-installed onto fails?

It boggles me, still, to this day, but your average MS-Windows user
never planned for this -- and so is completely unaware that he/she has
been totally screwed over.

Maybe I'm wrong in the above guess. Maybe you, Paul Zander, have a
proper retail copy of MS-Windows 7, acquired separately from your PC,
with required activation code, sitting in a ziplock bag in your office.
But the smart money's on 'Gee, I don't have that.'

If the latter, what do you do? Personally, I am firmly of the view
that, if you don't have a fully installable master copy of your software
(or at least the ability to acquire one quickly), then you don't
_really_ own it, because you're one hardware fault away from losing it
completely.

As I see it:

1. You could go buy MS-Windows 7. Again. And maybe make sure it isn't
a restricted 'recovery disc' or such.

2. You could decide you've had enough of awful compromises and abusive
customer relationships, and go open source. (I said farewell to
Microsoft operating systems on my computers for good around 1992.
For me, the last straw was when MS-Windows for Workgroups 3.11
hard-froze while I was copy-editing articles for _Blue Notes_ magazine,
the 40 page monthly newsletter of San Francisco PC User Group, My
copy-editing session were a pair of DOS sessions running shareware ASCII
editor QEdit on several files that I had been frequently updating to a
floppy disk. However, upon reboot, I found that not only had I lost the
contents of the QEdit buffers in RAM (no big loss), but also that
Windows for Workgroups had rewritten the edited files on-disck to zero
length, destroying hours of saved work.

That was it. I was done. That machine got reloaded with the beta of
OS/2 2.0 the same day, and then converted over to Linux a couple of
years later.

3. Work with the terrible hand you've been dealt.

If you can still boot that hard drive (which I believe you said you
can?), then perhaps you can (at least) create a 'recovery disc' on
optical media. That sucks, but it's way better than nothing. Or
perhaps you already have such an optical disc. I wasn't clear on that
from your very fragmentary description.

If your PC came with a 'recovery partition', then it's probably fairly
easy to poke around and figure out how to use it. Usually, this
involves pressing F8 or something like that during POST, bringing up a
menu of bootable targets. One is probably labelled something like
'Repair Your Computer' or something of that sort. If so, that's the
recovery partition.  I would guess that it's perfectly safe to boot that
partition to look around, i.e., I would guess that you are asked to give
confirmation before the 'recovery' installer blows away everything and
overwrites the hard drive. Suggest you explore that.

Inside MS-Windows 7 itself, I see alleged on the Web that there's a
built-in utility (Start menu, Back up and Restore, Create a System
Image) where you can request that the contents of the recovery partition
be burned / copied to somewhere.

Once you have, say, your 'recovery' image stored bootably on an optical
disc or a spare external hard drive, or something like that, you can
experiment with it to see if you can trick it into doing something
useful. Like, for example, you could create a virtual machine inside
VirtualBox, then make the VM boot the 'recovery' installer, and let it
do a 'recovery' installation entirely within the VM -- writing only to
the VM's virtual disk file, not to the host OS's hard drive(s). The
beauty of VM technology is that all software running there gets lied to
and told 'No, you have full control of an actual real computer. You're
not running in a simulation. Trust me[tm].' Such a Win7 installation
would then doubtless squawk about needing 'product activation'. So, I
guess at that point you telephone Microsoft's telephone line for
activation, say you've moved Win7 into a VM, and badger them into
helping you. Scuttlebutt says they're not unreasonable about this.

(This was one of your questions above, so this is my verbose way of
saying 'Yeah, at least one variant on your idea might well be a plan.')
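(A rough sketch of the VirtualBox side of that from the command line; the VM name, disk size, memory, and ISO path are all invented for illustration, and the GUI does the same thing with fewer keystrokes.)

  # Create and register an empty Windows 7 VM with a 40 GB virtual disk.
  VBoxManage createvm --name win7-recovery --ostype Windows7_64 --register
  VBoxManage createmedium disk --filename ~/vms/win7-recovery.vdi --size 40960
  VBoxManage storagectl win7-recovery --name SATA --add sata
  VBoxManage storageattach win7-recovery --storagectl SATA --port 0 \
      --device 0 --type hdd --medium ~/vms/win7-recovery.vdi
  # Attach the 'recovery' image as a DVD and boot from it.
  VBoxManage storageattach win7-recovery --storagectl SATA --port 1 \
      --device 0 --type dvddrive --medium ~/recovery.iso
  VBoxManage modifyvm win7-recovery --memory 2048 --boot1 dvd
  VBoxManage startvm win7-recovery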

Of course, in the alternative, maybe you imagine that you ought to
somehow back up the current exact state of your MS-Windows 7
installation including installed applications and somewhat ratty and
worse-for-the-wear Registry -- and then do a corresponding restore
operation later. I personally think this is a tactical error, for
multiple reasons including MS-Windows's tendency to accumulate bobbles
over time that are best dealt with by (infrequent) from-scratch
reinstallation of the OS, then from-scratch reinstallation of
applications, then re-creation of application configuration, then
restoration of user data files.

But, if you're feeling lucky and prefer to back up and later restore the
current exact state of your MS-Windows 7 installation including
installed applications, I'm sure you can somehow do that, possibly with
bundled backup/restore software, possibly with third-party software from
any of the cheerful publishers of proprietary MS-Windows utilities
standing by happy to accept your money.

Or, third alternative, there are ways to create _directly_ a virtual
disk image of your exact, literal, MS-Windows partition, for use under
VMware or VirtualBox. This notion has been the subject of discussion
within the year on this mailing list, so, if interested, let me know and
I can dredge up Mailman archive links to the relevant back postings.
(Or, you know, you could research that yourself. I don't have any magic
for finding such things.)
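(For the impatient, the two usual mechanisms look roughly like this.  Both assume the Windows partition is /dev/sda2, which is purely a placeholder, both need root, and the resulting image generally still needs boot-loader and driver fix-ups before it boots in a VM.)

  # Option 1: copy the partition out, then convert the raw image to VDI.
  dd if=/dev/sda2 of=win7-part.img bs=1M status=progress
  VBoxManage convertfromraw win7-part.img win7-part.vdi --format VDI
  # Option 2: wrap the real partition in a raw-disk VMDK (no copy made).
  VBoxManage internalcommands createrawvmdk -filename win7-raw.vmdk \
      -rawdisk /dev/sda -partitions 2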


Disclaimer: Although I'm confident what I say in relation to Windows 7
is correct in broad outline at least, you are solemnly advised to verify
particulars before following the advice of someone who gave MS-Windows
the heave-ho in disgust more than a quarter-century ago.


(A propos of nothing in particular, I'm pleased to note that QEdit's
current incarnation as The SemWare Editor = TSE is still around for
MS-DOS, OS/2, and MS-Windows users. It was nice.)
p***@ieee.org
2018-11-22 18:25:16 UTC
Permalink
[Discussion on disk drives and diagnostics.]

Rick's most valuable suggestion is to install and test diagnostics BEFORE there is a problem.  I admit I got into a thrashing mode, doing backups, trying to remember if there was something important in a different partition, and not being meticulous with the testing.
So what I can report is that the WD Data Lifeguard installed and runs on Win7 and Win8.  It recognizes all hard drives, even USB drives.  For any drive it can report the S.M.A.R.T. results.  It also has a list of tests, from a basic test to filling the drive with 0's.  Rick cautioned against running such tests on non-WD drives.

Now I know that my problem is that the Seagate SATA drive has problems with reallocating sectors.  This drive has my Debian root partition.

My Windoz partitions are not on this drive, so I can use the alternate OS and ignore the warnings about the failing drive.  More discussion on a different thread.

Meanwhile, the Seagate web site offers a tool for Windows.  On both Win7 and Win8 it starts, runs for a while, and reports "no devices mounted."  The Win7 system has an SG IDE and an SG SATA drive.  The Seagate web site has a search, but it wants the SERIAL number.  I think the WD tool might show the serial number.
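(The serial number is also easy to read from the Linux side, assuming smartmontools or hdparm is installed and that the Seagate really is /dev/sdc -- a placeholder device name:)

  # Prints model, serial number, and firmware revision.
  smartctl -i /dev/sdc
  # hdparm reports the same identify data.
  hdparm -I /dev/sdc | grep -i 'serial number'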

Seagate also has a USB tool.  I was busy using USB to copy files to the SSD.  That done, I have the bandwidth to read the documentation more carefully and also try the SG USB tool.

To answer a related question: no, I do not have a backup server.  To be practical, I really need to pull cables through the house so it doesn't need to be at the same desk.  Despite all of the hype, WiFi doesn't work when there are multiple walls and closets.

BTW, is it necessary to have the French Police siren?
Rick Moen
2018-11-28 10:01:44 UTC
Permalink
To answer a related question.  No I do not have a back-up server.  To
be practical, I really need to pull cables through the house so it
doesn't need to be at the same desk.  Despite all of the hype, WiFi
doesn't work when there are multiple walls and closets. 
Honestly, the low-hanging fruit is to use a storage target that can be
brought physically to your computer for as long as required to write a
backup set to it, then detached and stored somewhere remote from your
computer (so that, e.g., the same burglar or house fire or flooding
incident cannot get both).

Something as simple as a big hard drive in an external case that's
connectable to your system using USB or eSATA or Thunderbolt connectors
would more than suffice.
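A minimal sketch of that workflow, assuming the external drive ends up mounted at /media/backup (a made-up mount point) and that /home is what you care about:

  # Mirror home onto the external drive; -a preserves permissions and
  # timestamps, --delete drops files that no longer exist in the source.
  rsync -a --delete /home/ /media/backup/home/
  # Flush caches and detach cleanly before carrying the drive off-site.
  sync && umount /media/backup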
BTW, is it necessary to have the French Police siren?
Until I've migrated my server functionality off that machine, I am
_really not_ going to futz with it. I'm not even going to reboot it to
the BIOS Setup to see if there's a 'disable audio alarm if there's a
failed CPU' toggle. It's just not worth the risk of making a creaky old
server box fail.

I recommend whistling a cheerful tune. ;->
p***@ieee.org
2018-11-22 20:39:38 UTC
Permalink
The air is clear, and my data has been backed up, so I can re-think my situation.  The only "urgency" is if a Black Friday special is something to consider.





I really prefer Linux, but there are a handful of programs that I want/need that run only on Windows: tax software, some engineering packages.  Nothing that needs the latest version of Windows.  Generally I focus on that one activity.

I've used WINE, but it hasn't been reliable.  For example, upgrading Debian seems to break it.

Quite a while ago, we decided the choices for multiple OS are:

1. Install Win 7 or older.  Then install Debian.  At the end, the installer sees Windows and installs dual boot.

2. Win 8 and newer use UEFI.  Not compatible with GRUB.  What I did was get a big USB drive to install Debian.  Fuss with the BIOS to enable Legacy Boot.  I still have to press F12 to manually select the loader.  (See the sketch after this list for checking the boot mode.)

3. IF you have install media for Win, first install Linux, then virtualization software, and run Windows in a virtual machine.  Rick likes virtualization.  Makers of new computers generally don't include the install media.
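(The sketch referenced in #2, assuming an already-installed Linux: it tells you which firmware mode the running system booted in, and on UEFI systems lists the firmware's own boot menu.)

  # If this directory exists, the running system was booted via UEFI.
  [ -d /sys/firmware/efi ] && echo "UEFI boot" || echo "legacy BIOS boot"
  # On a UEFI boot, list (and optionally reorder) the firmware boot entries.
  efibootmgr -v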

As discussed in a different email, my vintage machine has multiple hard drives.  The bad drive has the Linux root partition.  The old, but working, drives have the Windows OS, so I could simply buy a new SATA drive and re-install Debian per #1.

I also was wondering about Windows recommending a backup.  Re-thinking, I realize that a lot of MS's offers to help are just "feel goods".  Like the pop-up that says I have a high-speed USB device in a low-speed port and offers to find a faster connection.  It searches, says none found.  Then opens the same pop-up 5 minutes later.  Same for the one that says that the disk (partition) is full, but "cleanup" has already been done.

There are probably better save and recovery choices.

Back to hardware options:

A. While I have the box open, replace the motherboard and get a newer CPU.  After a bit of research, there are so many options that I should go someplace like Central Computer or Zareason or ??? to make sure all of the pieces are compatible.

B. Buy an assembled laptop or desktop.  Expect that the MSRP for a laptop should be more than a desktop.

If it wasn't Black Friday, I would just get a SATA drive.  Maybe I can find a good discount on a machine that is just old enough to have good Linux support.  Other emails recommended getting a chipset that is about a year old.

p***@ieee.org
2018-11-23 21:20:31 UTC
Permalink
Thank you to everyone for letting me clutter your inbox.  I found it helpful just to write out the information.
So after a lot of web searching, I ordered a new Lenovo laptop from NewEgg.  AMD Ryzen processor.  I checked the exact model of CPU and graphics against the AMD web site.
It has a 2.5" internal SATA.  After making sure it is in working order, I will carefully remove many small screws, remove the hard drive, and install the drive that has been an external drive used to run Linux on another laptop.  Then I will have one laptop that starts as Win10 for those "special" activities, and another laptop that just boots from the internal drive with Linux, plus a "spare" drive with Win10.
And going back to email thread of many months ago, the install will be Debian Stable.
Lastly, install and test diagnostic utilities.  
Rick Moen
2018-11-28 09:53:37 UTC
Permalink
Thank you to everyone for letting me clutter your inbox.  I found it
helpful just to write out the information. So after a lot of web
searching, I ordered a new Lenovo laptop from newEgg.  AMD Ryzen
processor.  I checked the exact model of CPU and graphics against AMD
web-site. It has a 2.5" internal SATA.
Lucky you. I imagine you're going to be really happy with this puppy.
After making sure it is in working order, I will carefully remove many
small screws and remove the harddrive and install the drive that has
been an external drive to run Linux on other laptop.  Then I will have
one laptop that starts as Win10 for those "special" activities.  And
another laptop that just boots from internal drive with Linux, and an
"spare" drive with Win10.
Seriously: Try _an SSD_ in the new Lenovo. You'll never settle for
spinning rust again, after you've seen the huge performance difference.
Plus, they're utterly silent, draw far less power, emit essentially no
heat, and reduce system weight compared to spinning rust (hard drives).

But really, it's the insane performance difference that will turn your
head.

Spinning rust continues to have a very legitimate role when you need
large _amounts_ of storage. It costs a lot less per gig. But you can
have both fast primary storage _and_ capacious ancillary storage by
having an SSD inside your machine, and a hard drive (or RAID array of
them) in an external, detachable enclosure.
Lastly, install and test diagnostic utilities.  
Well, at least download them and archive a copy somewhere.
The classic way to do that is on an optical disc, which is still
pragmatic although it may seem a bit old-hat.  Burn the disc,
label it with a Sharpie, and pack it away as a master software archive
in case it's ever needed.
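E.g., if the diagnostic ships as an ISO image (file names below are
just placeholders), burning it from Linux can be as simple as:

# wodim -v dev=/dev/sr0 diagnostics.iso

or, for DVD media, growisofs -dvd-compat -Z /dev/sr0=diagnostics.iso .
Adjust the device name to match your burner.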
Deirdre Saoirse Moen
2018-11-30 20:51:57 UTC
Permalink
To make Rick's point: when we switched from MacBook Pros with spinning rust :) to MacBook Airs with SSDs about five years ago, we were surprised how much faster our machines were even though the CPU was so much less powerful.

I don't think I could go back to a primary drive that was a hard drive.
--
Deirdre Saoirse Moen
Post by Rick Moen
Seriously: Try _an SSD_ in the new Lenovo. You'll never settle for
spinning rust again, after you've seen the huge performance difference.
Plus, they're utterly silent, draw far less power, emit essentially no
heat, and reduce system weight compared to spinning rust (hard drives).
But really, it's the insane performance difference that will turn your
head.
p***@ieee.org
2018-12-01 20:37:47 UTC
Permalink
We have to remember that the recording medium for all hard drives, even before there were 8" floppy disks, has been oxides of iron and similar metals.
The memory hierarchy concept continues.  $100 can buy:
*  16 GB DDR4
*  500 GB USB 3.1
*  2 TB SATA HDD with 128 MB cache

Also, 8GB RAM is "typical" for laptops on the market today.


On the topic of disk drives:  Previously I had reported that the Seagate disk diagnostic did not run under Windows.  I have since discovered that it requires rebooting the system; then it runs all the time in the background.  In contrast, the WD disk diagnostic for Windows runs when you invoke it and then quits.  Both tools will recognize SG, WD, and Toshiba drives.  They can read the S.M.A.R.T. information and have a menu of several other tests for a specified drive.

In the near future, I will try the SG bootable USB diagnostic.  If it works, it will go with the recovery drives and data backups in a different room.

A few years ago, I bought a 1 TB HDD with 64?GB cache.  After a couple of reboots, the time to boot Linux went from ~30 seconds to under 5 seconds.




On Friday, November 30, 2018, 12:54:52 PM PST, Deirdre Saoirse Moen <***@deirdre.net> wrote:

To make Rick's point: when we switched from MacBook Pros with spinning rust :) to MacBook Airs with SSDs about five years ago, we were surprised how much faster our machines were even though the CPU was so much less powerful.

I don't think I could go back to a primary drive that was a hard drive.
--
  Deirdre Saoirse Moen
Seriously:  Try _an SSD_ in the new Lenovo.  You'll never settle for
spinning rust again, after you've seen the huge performance difference.
Plus, they're utterly silent, draw far less power, emit essentially no
heat, and reduce system weight compared to spinning rust (hard drives). 
But really, it's the insane performance difference that will turn your
head. 
Tony Godshall
2018-12-02 02:14:58 UTC
Permalink
Ummm, I've bought 4-5tb in USB 3.1 for ~$99 at Costco
Post by p***@ieee.org
We have to remember that the physics for all hard drives even before there
were 8" floppy disks was oxides of iron and similar metals.
* 16 GB DDR4
* 500GB USB 3.1
* 2TB HDD with 128MB cache SATA
Also, 8GB RAM is "typical" for laptops on the market today.
On the topic of disk drives. Previously I had reported that the Seagate
Disk diagnostic did not run under Windows. I have since discovered that it
requires rebooting the system. Then it runs all the time in the
background. In contrast the WD disk diagnostic for Windows runs when you
invoke it and then quits. Both tools will recognize SG, WD and Toshiba
Drives. They can read the S.M.A.R.T. information and have a menu for
several other tests on a specified drive.
In the near future, I will try the SG bootable USB diagnostic. If it
works, it will go with the recovery drives and data back up in a different
room.
A few years ago, I bought 1 TB HDD with 64?GB cache. After a couple of
reboots, the time to boot Linux went from ~30 seconds to under 5 seconds.
On Friday, November 30, 2018, 12:54:52 PM PST, Deirdre Saoirse Moen <
To make Rick's point: when we switched from MacBook Pros with spinning
rust :) to MacBook Airs with SSDs about five years ago, we were surprised
how much faster our machines were even though the CPU was so much less
powerful.
I don't think I could go back to a primary drive that was a hard drive.
--
Deirdre Saoirse Moen
Post by Rick Moen
Seriously: Try _an SSD_ in the new Lenovo. You'll never settle for
spinning rust again, after you've seen the huge performance difference.
Plus, they're utterly silent, draw far less power, emit essentially no
heat, and reduce system weight compared to spinning rust (hard drives).
But really, it's the insane performance difference that will turn your
head.
Tony Godshall
2018-12-02 02:15:51 UTC
Permalink
Oops ignore that one. This is my subscribed address.
Post by Tony Godshall
Ummm, I've bought 4-5tb in USB 3.1 for ~$99 at Costco
Post by p***@ieee.org
We have to remember that the physics for all hard drives even before
there were 8" floppy disks was oxides of iron and similar metals.
* 16 GB DDR4
* 500GB USB 3.1
* 2TB HDD with 128MB cache SATA
Also, 8GB RAM is "typical" for laptops on the market today.
On the topic of disk drives. Previously I had reported that the Seagate
Disk diagnostic did not run under Windows. I have since discovered that it
requires rebooting the system. Then it runs all the time in the
background. In contrast the WD disk diagnostic for Windows runs when you
invoke it and then quits. Both tools will recognize SG, WD and Toshiba
Drives. They can read the S.M.A.R.T. information and have a menu for
several other tests on a specified drive.
In the near future, I will try the SG bootable USB diagnostic. If it
works, it will go with the recovery drives and data back up in a different
room.
A few years ago, I bought 1 TB HDD with 64?GB cache. After a couple of
reboots, the time to boot Linux went from ~30 seconds to under 5 seconds.
On Friday, November 30, 2018, 12:54:52 PM PST, Deirdre Saoirse Moen <
To make Rick's point: when we switched from MacBook Pros with spinning
rust :) to MacBook Airs with SSDs about five years ago, we were surprised
how much faster our machines were even though the CPU was so much less
powerful.
I don't think I could go back to a primary drive that was a hard drive.
--
Deirdre Saoirse Moen
Post by Rick Moen
Seriously: Try _an SSD_ in the new Lenovo. You'll never settle for
spinning rust again, after you've seen the huge performance difference.
Plus, they're utterly silent, draw far less power, emit essentially no
heat, and reduce system weight compared to spinning rust (hard
drives).
Post by Rick Moen
But really, it's the insane performance difference that will turn your
head.
Rick Moen
2018-12-02 02:28:50 UTC
Permalink
Post by Tony Godshall
Oops ignore that one. This is my subscribed address.
Friendly listadmin tip, to assist those using a variety of e-mailboxes,
as many do these days: In addition to the address you _intend_ to post
from, on Conspire, also subscribe all of your alternate e-mailboxes.
As you subscribe each of those, visit your subscription options page and
set the 'no mail' flag for that additional subscribed address.

Now, if you absent-mindedly send mailing list mail from the wrong
posting address (as people often do), it'll nonetheless go straight
through rather than landing in Mailman's administrative queue (pending
my manual approval) on account of arriving from a non-subscribed
address.  Having set 'no mail' prevents you from getting multiple
copies of each new posting.
p***@ieee.org
2018-12-02 16:27:05 UTC
Permalink
Costco can be a good place to buy a computer.  Instead of searching for info on the internal details, you can just take it home and see if it works for your particular purposes, software, add-ons.  Just keep the packaging and receipt and watch the calendar.
On the other hand, things come and go.  What are the chances that I could find the USB memory if I went there tomorrow?  For that matter, different stores may or may not have a given item in stock.


On Saturday, December 1, 2018, 10:15:19 PM AST, Tony Godshall <***@gmail.com> wrote:

Ummm, I've bought 4-5tb in USB 3.1 for ~$99 at Costco


On Sat, Dec 1, 2018, 12:51 PM ***@ieee.org <***@ieee.org> wrote:

We have to remember that the physics for all hard drives even before there were 8" floppy disks was oxides of iron and similar metals.
The memory hierarchy concept continues.  $100 can buy:
*  16 GB DDR4
*  500GB USB 3.1
*  2TB HDD with 128MB cache SATA

Also, 8GB RAM is "typical" for laptops on the market today.
Rick Moen
2018-12-02 02:21:30 UTC
Permalink
Post by p***@ieee.org
We have to remember that the physics for all hard drives even before
there were 8" floppy disks was oxides of iron and similar metals.
*  16 GB DDR4
*  500GB USB 3.1
*  2TB HDD with 128MB cache SATA
Also, 8GB RAM is "typical" for laptops on the market today.
Vocabulary word for the day is 'amortisation' (or, for most people here,
'amortization').

Which is to say, when purchasing things like computer hardware
with a high likelihood of an economic service life extending over several
years, the trick is to do your planning based on estimated value each
year until decommissioning.

Hardware that you can expect to be fully usable for ten years is worth
spending for, even if it costs twice as much as alternative choices you
guess you'll decommission (or put in a closet) in four years.

I keep seeing people buying underdesigned gear that'll be unsatisfactory
in only a few years, merely because it was 'cheap', ignoring that it's
actually quite expensive on an annual basis over its service life.

More at:
http://linuxmafia.com/~rick/lexicon.html#moenslaw-bicycles
(As /usr/games/fortune just said to me: 'Cheap things are of no value,
valuable things are not cheap.')

I'm sure you're correct that 8GB is 'typical' for laptops on the market
today. IMO, the right thing to do is determine before buying what'd
be required, eventually, to substantially increase that. Find out how
many memory slots the unit has, how many are currently occupied, whether
DIMMs must be provisioned in matched pairs or not, what max. RAM the
motherboard supports, and how much it currently costs to (say) quadruple
total RAM by buying high-density DIMMs of the supported type on the
retail market (from, say, http://www.satech.com/ ).

You'll thank yourself later for doing that preparation up-front.
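On a Linux box, much of that homework can be done without even opening
the case, e.g. (as root; exact output varies by vendor):

# dmidecode -t memory | grep -Ei 'maximum capacity|number of devices|size:'

That shows the motherboard's maximum supported RAM, how many DIMM slots
exist, and what currently occupies each one - most of what you need to
know before shopping for DIMMs.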
Peter Knaggs
2018-12-04 02:04:17 UTC
Permalink
On the topic of well-designed things, the little iRiver iHP-120 audio
recorder/player can be upgraded nowadays to use a Compact Flash card
instead of the original "spinning rust" Toshiba 20GB (MK2004GAL) 80mm
hard drive.  Essentially the same concept of upgrading from spinning
rust to SSD, except over a considerably longer time frame.  I think the
iRiver iHP-120 is almost twenty years old now, but it still works fine
thanks to the free Rockbox firmware.  Even the original Toshiba hard
drive surprisingly still works (it's required in order to be able to
update the firmware on the iHP-120 so that it can work with Compact
Flash storage).  Some photos of what the modification (still somewhat
of a work in progress, but currently usable) looks like are here:

http://www.penlug.org/foswiki/bin/view/Main/HardwareInfoIriverIhp120
Post by Rick Moen
Post by p***@ieee.org
We have to remember that the physics for all hard drives even before
there were 8" floppy disks was oxides of iron and similar metals.
* 16 GB DDR4
* 500GB USB 3.1
* 2TB HDD with 128MB cache SATA
Also, 8GB RAM is "typical" for laptops on the market today.
Vocabulary word for the day is 'amortisation' (or, for most people here,
'amortization').
Which is to say, the trick when purchasing things like computer hardware
with high likelihood of economic service life extending over several
years, the trick is to do your planning based on estimated value each
year until decommissioning.
Hardware that you can expect to be fully usable for ten years is worth
spending for, even if it costs twice as much as alternative choices you
guess you'll decomission (or put in a closet in four years).
I keep seeing people buying underdesigned gear that'll be unsatisfactory
in only a few years, merely because it was 'cheap', ignoring it being
actually quite expensive on an annual basis over its service life.
http://linuxmafia.com/~rick/lexicon.html#moenslaw-bicycles
(As /usr/games/fortune just said to me: 'Cheap things are of no value,
valuable things are not cheap.')
I'm sure you're correct that 8GB is 'typical' for laptops on that market
today. IMO, the right thing to do is determine before buying what'd
be required, eventually, to substantially increase that. Find out how
many memory slots the unit has, how many are currently occupied, whether
DIMMs must be provisioned in matched pairs or not, what max. RAM the
motherboard supports, and how much it currently costs to (say) quadruple
total RAM by buying high-density DIMMs of the supported type on the
retail market (from, say, http://www.satech.com/ ).
You'll thank yourself later for doing that preparation up-front.
Michael Paoli
2018-11-22 23:10:32 UTC
Permalink
Date: Sun, 18 Nov 2018 19:21:55 +0000 (UTC)
Subject: [conspire] 3rd Master Hard Disk Error
Content-Type: text/plain; charset="utf-8"
During POST and before GRUB, one of my computers gives the error message
    3rd Master Hard Disk Error
and says press F1 to continue.
Well this PC has 3 hard drives.  Which one is it?  Doing a
* Access the BIOS to turn off the test.
* Replace "the hard drive".
At first I thought "3rd" meant a specific drive like /dev/hdc.
Apparently computers with only 1 drive generate the same message.
I have taken the precaution to make copies of important data on
different physical drives.
Well, not a whole lot of detail (some more in subsequent postings, but
still not all that much).

Backups ... yep (whole 'nother topic)

So, first thing I think of with error diagnostics:
they're often the least-exercised code in the system/software,
so, though they're not so likely to be totally inaccurate,
it is fairly common that what they claim or state the issue to be
isn't quite right ... but if not spot on, it's typically something
closely, or at least somewhat, associated with whatever is actually
failing/failed or misbehaving, etc. - but even then, not always.
So, always take error diagnostics with a grain of salt and presume that
they may not be exactly correct (and even sometimes totally off-base).

"3rd Master Hard Disk Error"

So, we (or at least I) would start to question and be skeptical of
the diagnostics (sometimes they're quite unclear, e.g. I remember one
from a BIOS years ago: "But Segment Doesn't Found").
So, yeah, diagnostics may not be highly clear in what they're
attempting to communicate.

So ... "3rd" ... 3rd ... what, drive? Or 3rd time that error's been
caught? Maybe it counts the errors but only counts/buffers up to 3,
and doesn't report 'till it's hit that threshold of 3? "3rd" ...
counting starting from 0, or 1?

Is there any hardware RAID involved? If so, the diagnostic might not
be about physical drive(s), but about some (sub-)component of the
RAID or some logical portion thereof, etc.

"Master" - sounds fairly likely to be "Master" of Master-Slave ...
where Slave may not even be present - so maybe referring to the
Master/Primary on IDE/ATA/PATA, and being the "3rd", it might count
'em:
1st: Master on 1st IDE/ATA/PATA bus
2nd: Slave on 1st IDE/ATA/PATA bus
3rd: Master on 2nd IDE/ATA/PATA bus
...
Does it count floppies first? Even if they're purely legacy/vestigial
and may not physically exist or have controllers, but might still be
in the BIOS code because nobody took it out yet?

I'd probably proceed - presuming there's no hardware RAID involved - to ...
well, it also depends upon priorities and objectives. If one is more
concerned about the data and less about the host and its services remaining
up as feasible, one may want to reboot to single user mode - or, if keeping
services up/running as feasible is more important than the data, then
don't do that. ;-)

Try a full read of all the physical drives, e.g.:
might want to mount a bit 'o USB flash or SD or the like to have a
place to write some stuff ... or log stuff while connected over network
or via serial console ... anyway, maybe something like:
# dd if=/dev/sda ibs=512 of=/dev/null &
Can likewise do that for each drive (e.g. also /dev/sdb, /dev/sdc, ...
whatever drives you have that may possibly be suspect)
Can also redirect stderr on those, e.g. 2>sda.dd.err
Might also look at existing log files - it may have caught and logged
relevant error(s) there.
In any case, once those dd processes have completed, did any give any
errors?
If all the disks read all the way through without errors, then you may be
in good to excellent shape - though there would remain the question (and
curiosity) of what caused the diagnostic and whether there is still an
issue - or possibly an intermittent issue - or not.
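(If checking several drives, something along these lines - an untested
sketch, adjust the drive names to what's actually present - keeps the
error output separated per drive:
# for d in sda sdb sdc; do dd if=/dev/"$d" ibs=512 of=/dev/null 2>"$d".dd.err & done; wait
# grep -i error *.dd.err
Same read test as above, just looped, with one stderr log per drive.)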
You can also examine the SMART data on the drives - there are utilities
that can do that - that can be quite useful/informative. That can
also let you know if you have a drive that's failed or is in imminent danger
of doing so ... "of course" just because everything looks fine on the
drive doesn't mean it can't or won't fail at any time anyway (redundancy,
backups!, ...). In many cases, the SMART data & utilities can also
tell you if the firmware on the drive should be updated - some older
firmware has bugs, sometimes even critical/important ones, in which
case the drive firmware ought to get updated.
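E.g., with smartmontools installed, something like (drive name merely
an example):
# smartctl -H /dev/sda
# smartctl -A /dev/sda
# smartctl -t long /dev/sda
The first is the drive's overall self-assessment, the second the raw
attribute table (watch Reallocated_Sector_Ct, Current_Pending_Sector,
Offline_Uncorrectable), and the third kicks off the drive's own long
self-test (results later via smartctl -l selftest /dev/sda).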

If you got error(s) reading ... where, what drive, what's on it?
If the errors are solid/consistent, rather than intermittent, can
generally work to isolate - drive ... partition ... filesystem or
what have you ... what on or within filesystem or file or raw data
storage? Sometimes one may find the error is on the filesystem,
but not in an area that's in use currently. Sometimes one will find
the error is on something in use - e.g. within only and exactly one
specific file. Depending what it is, it might be semi-simple to
repair. If you get a hard read failure, for non-ancient drives, they
generally map bad blocks out - if it's a hard read error and has spare
blocks, it will still generally map it out when written. So, overwrite
the file (or that bit of it - know the data you need to fix it?), and
you "fix" the issue.

If you have the time/luxury to be able to take drive(s) off-line
and do destructive write tests on the drive(s), one can test quite
a bit more thoroughly. Those tests can also take quite a bit of
time, and may also be overkill, depending what one's objectives are.
In many such cases, I'll just do one single overwrite and read back -
especially for larger drives, and will mostly consider that "good enough"
for uses that aren't quite to highly critical. Also, random access on
spinning rust (hard drives, as opposed to SSD), can give very different
(typically much more thorough and closer to real world use most of the
time) results than sequential ... but sequential is much much faster for
spinning rust (and random access is of no significant advantage to test
over sequential, for SSD).
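(E.g., for the thorough destroy-all-data variety:
# badblocks -wsv /dev/sdX
which does multiple write-and-verify passes with different patterns; or
the quicker single overwrite and read-back I'd usually settle for:
# dd if=/dev/zero of=/dev/sdX bs=1M oflag=direct
# dd if=/dev/sdX of=/dev/null bs=1M iflag=direct
/dev/sdX being a placeholder for the drive under test - and the write
steps, of course, wipe everything on it.)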

Oh, and if one has a SSD / hard disk hybrid drive (combines both)
... not sure how to best test something like that - might have to
resort to drive vendor tools and/or rely upon SMART data some fair
bit more. In general, vendor diagnostic/test tools can often do
fair bit more low-level stuff on the drive - but most of the time
I don't bother with such - I figure the drive works, or works
"well enough" and is "sufficiently reliable" ... or ... it ain't
and is time for wipe+ecycle. I figure at least most of the time
if I have to resort to vendor tools to get a drive working again,
that drive is too unreliable (notwithstanding possibly updating
drive firmware - and if feasible without needing vendor specific
tools/software to do that). One noteworthy exception - drive still
under warranty? May need to use vendor tools to convince vendor the
drive is failed and they need to replace it under warranty (thus far on
my personal stuff, really only had one drive fail within warranty,
and yes I did get warranty replacement drive).

I've had multiple occasions where I've gotten hard read failures on
drives, and have been able to "repair" them in such manner (e.g.
one drive, I had that happen a grand total of twice over a period
of about 10 years, and that drive was in nearly continuous operation
most all of that time, and earlier this year likewise "fixed"
a drive that had issue like that - and from there was then able to
remirror that drive to another drive - then replace the one that
had the error and then mirror back to it again to have good solid
RAID-1 once again and on non-flakey drives).
If you get the (very rare?) hard read errors on drive, you might consider
them at least somewhat less reliable - sometimes it happens. One may
want to replace them ... or not. SMART data can give you a fairly good
idea of how (semi-)okay ... or definitely not, the drive is - or if it's
likely to fail again soon or have developing progressive problems.

Anyway one of those "fix" stories I have from earlier this year is pretty
interesting - may pass that along, ... but alas, that was work goop, so
would need to redact at least some (notably identifying) bits of it (but
most is sufficiently generic it doesn't matter).
Michael Paoli
2018-11-23 19:23:59 UTC
Permalink
Subject: Re: 3rd Master Hard Disk Error
Date: Thu, 22 Nov 2018 15:10:32 -0800
Anyway one of those "fix" stories I have from earlier this year is pretty
interesting - may pass that along, ... but alas, that was work goop, so
would need to redact at least some (notably identifying) bits of it (but
most is sufficiently generic it doesn't matter).
So ... bit of background.

An f5 "appliance" - this particular case and "appliance", the f5 ...
actually a pair of them - one specifically addressed here - the nominal
standby), and the other the active primary. They're Linux based devices,
but ... "appliance" - most of that Linux stuff relatively "hidden",
behind-the-scenes, but ... actually accessible. And, "of course",
all of f5's stuff (software, etc.) layered atop that.

In this particular case, non-production ("lab"), and suffering some
neglect (priorities/resources) and failure(s), we're in situation where
we've got nominally pair of mirrored disks on the unit ... but one has
failed (actually quite a while earlier), and has been replaced (so
effectively no data on the replacement drive), but the other is also
giving hard failures(!). With the support from f5, we'd gotten to the
point where they're basically, "Yeah, you need to rebuild that from
scratch/backups - can't access/recover the data on there."

Myself, of course, knowing it's all atop Linux, I decide to dig a bit
and see if I can recover/fix it without too much difficulty, and avoid
the hassle/complexity (fair number of manual steps and time) to rebuild
and restore from backup(s), etc.

Also, Linux *based* ... but lots of f5 stuff atop that, so ... one can't
presume *too* much about it. E.g. *reading* data is generally fine,
but changing stuff can be (potentially very) hazardous - e.g. if one
changes thing(s) at Linux layer that f5 would expect to be changed
through the f5 interfaces, chaos may ensue - as the f5
software/configurations wouldn't know about or be expecting those
changes. So, one needs be quite mindful of that and sufficiently
careful. Anyway, notwithstanding that major caveat, one can learn, at
the Linux layers, much of what the "appliance" is doing, how, some of
where it is/isn't healthy, etc. And, in some cases, potentially -
carefully - fix/change *some* things. Also, this being the nominal
standby unit, rather than active, we've got much more flexibility to
bring things down, etc. - so that makes it quite a bit easier to deal
with.

Also, these particular f5 "appliances" - they're physical x86 based
hardware units. So there's a physical host at that level. They *also*
support virtualization, so there are *also* "guest" Virtual Machines
(VMs) running atop the hardware - again Linux-based, with f5 stuff atop
that - on both the host running direct on physical, and the "guest" VMs
within - which in most regards look very highly similar to the host
on physical - including all the f5 bits within (just doesn't have yet
another VM layer down underneath the guests - so no guests of guests).

So, again, the context - f5 - failed/failing hard drive (spinning rust),
nonrecoverable error(s), but otherwise (mostly) operational (at
Operating System (OS) level), etc. Two hard drives, nominally mirrored,
the "good" drive is effectively without data - it's a replacement that
was earlier installed for other disk that had much earlier failed
and was nowhere near current on the mirroring anyway. So at this
point essentially one good drive without data, and one problematic
drive (throwing hard errors).

Also, slight aside - folks will reasonably "disagree" / have differing
perspectives on complexity and its disadvantages and (sometimes)
advantages (what exactly was gained by bringing in that complexity?).
The example here has much complexity (partitions, mdraid, LVM, virtual
machines, f5 vendor layer, and also most all that complexity also within
VM guests) - and that complexity has both disadvantages (much to know /
dig through), and also advantages too (e.g. able to "drill down",
isolate & correct problem - and problem impact also quite limited in
scope due to its isolation quite far down in the layers).

Without further introduction, and slightly redacted:

From: Michael Paoli [REDACTED]
Date: Mon, Aug 6, 2018 at 11:48 AM
Subject: Re: [E] RE: Regarding Service Request #[REDACTED] | | Follow
up from RMA [REDACTED] & [REDACTED]
To: <[REDACTED]@f5.com>
Cc: <[REDACTED]@f5.com>

Poked at it a bit over the weekend - remotely ...
was able to get the data recovered & all remirrored okay, without
rebuild/reinstall.
We'll still probably want to go ahead and replace the disk that had
the read failure, but at present I'm moving on to getting the
[REDACTED] unit properly remirrored, then will circle back to
replacing the quite suspect hardware (disk(s)) that still ought be
replaced ... even if they may be working at the moment (notably ones
that had hard failures earlier).

The short, and longer versions of the recovery without rebuild, in
case you're interested/curious:
basically managed to recover the disk that was giving unrecoverable read
errors, and get it remirrored onto replacement drive - using lower level
Linux bits (not fully sanctioned f5 approach, but fully doable ... so
long as we're sufficiently careful as to not conflict with any f5
specific bits).

[REDACTED]

unrecoverable read errors seen on:
/dev/md16 /shared/vmdisks
/dev/md16 835G 14G 780G 2% /shared/vmdisks
e.g.:
md/raid1:md16: dm-2: unrecoverable I/O read error for block 12812032
sd 0:0:0:0: [sda] Sense Key : Medium Error [current] [descriptor]
sd 0:0:0:0: [sda] Add. Sense: Unrecovered read error - auto reallocate failed
end_request: I/O error, dev sda, sector 35690808

on /shared/vmdisks filesystem, very few files and directories,
no directory read errors,
file read error only on:
[REDACTED].img
all other files read fine
*.img files used by guests (as shown by VM PID(s) having them open as
seen via fuser)
disabled guests until no PIDs had those *.img files open

can't copy full [REDACTED].img due to read errors

/shared/vmdisks filesystem is ext3 so journaling is used (overwrite of a
file may be rewritten to different blocks rather than in place, and
probably is).  Can we read the *data* within [REDACTED].img (at least the
data that's actually used)?

# losetup -f [REDACTED].img
# losetup -a
/dev/loop0: [0910]:6436 ([REDACTED].img)
is image partitioned?
don't have partx ....

# sfdisk -uS -l /dev/loop0
Disk /dev/loop0: cannot get geometry

Disk /dev/loop0: 13054 cylinders, 255 heads, 63 sectors/track
Warning: The partition table looks like it was made
for C/H/S=*/16/63 (instead of 13054/255/63).
For this listing I'll assume that geometry.
Units = sectors of 512 bytes, counting from 0

Device Boot Start End #sectors Id System
/dev/loop0p1 * 1 460655 460655 b W95 FAT32
/dev/loop0p2 460656 2557295 2096640 82 Linux swap / Solaris
/dev/loop0p3 2557296 209713391 207156096 8e Linux LVM
/dev/loop0p4 0 - 0 0 Empty
#

Yes it's partitioned ...
What data within do we care about?
boot area +MBR (before start of earliest partition)
data within filesystems (not counting slack space) but also including
early space within filesystem (possibly boot data or similar)
don't care about slack space within
Which can/can't we read (and thus copy) and which are or may be
problematic?
Let's try straight copy, see how far we get (expecting it to fail)
# cp -p [REDACTED].img [REDACTED].img.COPY || ls -l [REDACTED].img*
cp: reading `[REDACTED].img': Input/output error
-rw-r--r-- 1 root root 107374182400 Aug 3 16:31 [REDACTED].img
-rw------- 1 root root 49137192960 Aug 3 16:51 [REDACTED].img.COPY
#
That's almost half - should have up through first 2 partitions, and fair
bit of 3rd, but not complete
So, what we're missing in terms of actual data, is (or may at least
include) stuff within 3rd partition ... but that's LVM ... what do we
actually have in there (and how much used and not?)

let's change our existing loop to ro:
# losetup -d /dev/loop0 && losetup -r -f [REDACTED].img; losetup -a
/dev/loop0: [0910]:6436 ([REDACTED].img)
#
The losetup we have doesn't have --sizelimit, but it does have
-o (offset)

# expr 2557296 \* 512
1309335552
# losetup -d /dev/loop1
# losetup -r -f -o 1309335552 [REDACTED].img && losetup -a
/dev/loop0: [0910]:6436 ([REDACTED].img)
/dev/loop1: [0910]:6436 ([REDACTED].img), offset 1309335552
#

# pvscan -u
...
PV /dev/loop1 with UUID wZSZUl-LrU9-kBIg-2EkM-G4wg-4nqt-J6FYVI VG
vg-db-vda lvm2 [98.78 GB / 51.91 GB free]
...
Here's a very interesting bit:
# ls -alsh [REDACTED].img
4.6G -rw-r--r-- 1 root root 100G Aug 3 16:31 [REDACTED].img
So ... (very) sparse file - mostly unallocated blocks.
GNU cp has an option to be highly efficient about sparse copies ...
but ... we wouldn't get a read error from a non-allocated block,
so, somewhere in the ~4.6G are read error(s)
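(That cp option, for reference, is --sparse=always, e.g.
cp --sparse=always [REDACTED].img elsewhere.img - just noting it here,
not what was used.)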

VG name conflicts with existing so ...

# vgimportclone -n vg-db-vda.[REDACTED] -i /dev/loop1
WARNING: Activation disabled. No device-mapper interaction will be
attempted.
/tmp/snap.iWI12298/vgimport0: write failed after 0 of 4096 at 4096:
Operation not permitted
pv_write with new uuid failed for /tmp/snap.iWI12298/vgimport0.
0 physical volumes changed / 1 physical volume not changed
Fatal: Unable to change PV uuid for /tmp/snap.iWI12298/vgimport0, error: 5
#

but need to be rw to import ...

# losetup -d /dev/loop1 && losetup -f -o 1309335552 [REDACTED].img &&
losetup -a
/dev/loop0: [0910]:6436 ([REDACTED].img)
/dev/loop1: [0910]:6436 ([REDACTED].img), offset 1309335552
#

# vgimportclone -n vg-db-vda.[REDACTED] -i /dev/loop1
WARNING: Activation disabled. No device-mapper interaction will be
attempted.
Physical volume "/tmp/snap.BnJ12488/vgimport0" changed
1 physical volume changed / 0 physical volumes not changed
WARNING: Activation disabled. No device-mapper interaction will be
attempted.
Volume group "vg-db-vda" successfully changed
Volume group "vg-db-vda" successfully renamed to "vg-db-vda.[REDACTED]"
Reading all physical volumes. This may take a while...
Found volume group "vg-db-sdb" using metadata type lvm2
Found volume group "vg-db-sda" using metadata type lvm2
Found volume group "vg-db-vda.[REDACTED]" using metadata type lvm2
Found volume group "vg-db-cpmirror" using metadata type lvm2
# vgchange -a y "vg-db-vda.[REDACTED]"
16 logical volume(s) in volume group "vg-db-vda.[REDACTED]" now active
# lvs | fgrep vg-db-vda.[REDACTED] | sed -e 's/[ ]*$//'
dat.log.1 vg-db-vda.[REDACTED] -wi-a- 7.00G
dat.maint.1 vg-db-vda.[REDACTED] -wi-a- 300.00M
dat.share.1 vg-db-vda.[REDACTED] -wi-a- 20.00G
dat.swapvol.1 vg-db-vda.[REDACTED] -wi-a- 1.00G
set.1._config vg-db-vda.[REDACTED] -wi-a- 3.00G
set.1._usr vg-db-vda.[REDACTED] -wi-a- 2.20G
set.1._var vg-db-vda.[REDACTED] -wi-a- 3.00G
set.1.root vg-db-vda.[REDACTED] -wi-a- 392.00M
set.2._config vg-db-vda.[REDACTED] -wi-a- 512.00M
set.2._usr vg-db-vda.[REDACTED] -wi-a- 1.25G
set.2._var vg-db-vda.[REDACTED] -wi-a- 3.00G
set.2.root vg-db-vda.[REDACTED] -wi-a- 256.00M
set.3._config vg-db-vda.[REDACTED] -wi-a- 512.00M
set.3._usr vg-db-vda.[REDACTED] -wi-a- 1.25G
set.3._var vg-db-vda.[REDACTED] -wi-a- 3.00G
set.3.root vg-db-vda.[REDACTED] -wi-a- 256.00M
# vgdisplay vg-db-vda.[REDACTED]
--- Volume group ---
VG Name vg-db-vda.[REDACTED]
System ID
Format lvm2
Metadata Areas 1
Metadata Sequence No 28
VG Access read/write
VG Status resizable
MAX LV 0
Cur LV 16
Open LV 0
Max PV 0
Cur PV 1
Act PV 1
VG Size 98.78 GB
PE Size 4.00 MB
Total PE 25287
Alloc PE / Size 11999 / 46.87 GB
Free PE / Size 13288 / 51.91 GB
VG UUID Pr3x0P-li8m-HUAR-K5w4-v7Y0-vQjL-YyABNk

#
Note that over half the VG is unallocated blocks
Which can we recover?
# mkdir recover
# (for f in /dev/mapper/vg--db--vda.[REDACTED]*; do b=`basename "$f"`
&& cat "$f" > recover/"$b"; done)
cat: /dev/mapper/vg--db--vda.[REDACTED]-set.1._config# : Input/output error
All LVs read fine, except the one immediately above
# blkid /dev/mapper/vg--db--vda.[REDACTED]-set.1._config
/dev/mapper/vg--db--vda.[REDACTED]-set.1._config:
LABEL="set.1./config" UUID="ba980429-7a35-46c5-925b-9f4eadcc4ba0"
SEC_TYPE="ext2" TYPE="ext3"
# mkdir [REDACTED].config && mount -o ro,nosuid,nodev
/dev/mapper/vg--db--vda.[REDACTED]-set.1._config [REDACTED].config
# df -h [REDACTED].config
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/vg--db--vda.[REDACTED]-set.1._config
3.0G 72M 2.8G 3% /shared/vmdisks/[REDACTED].config
#
Filesystem only 3% used anyway ...
# find [REDACTED].config ! -type d ! -type f ! -type l -print
Only files, directories, and sym links ... can we back them all up?
Yes, all read and backed up without error:
(cd [REDACTED].config && tar -cf - .) | gzip -9 > [REDACTED].config.tar.gz
What about metadata?
# blkid /dev/mapper/vg--db--vda.[REDACTED]-set.1._config
/dev/mapper/vg--db--vda.[REDACTED]-set.1._config:
LABEL="set.1./config" UUID="ba980429-7a35-46c5-925b-9f4eadcc4ba0"
SEC_TYPE="ext2" TYPE="ext3"
# umount [REDACTED].config && rmdir [REDACTED].config
# sfdisk -uS -d /dev/loop0
# partition table of /dev/loop0
unit: sectors

/dev/loop0p1 : start= 1, size= 460655, Id= b, bootable
/dev/loop0p2 : start= 460656, size= 2096640, Id=82
/dev/loop0p3 : start= 2557296, size=207156096, Id=8e
/dev/loop0p4 : start= 0, size= 0, Id= 0
# vgchange -a n "vg-db-vda.[REDACTED]"
# vgdisplay vg-db-vda.[REDACTED]
VG UUID Pr3x0P-li8m-HUAR-K5w4-v7Y0-vQjL-YyABNk
# vgrename Pr3x0P-li8m-HUAR-K5w4-v7Y0-vQjL-YyABNk vg-db-vda
Volume group "vg-db-vda.[REDACTED]" successfully renamed to "vg-db-vda"
# mv [REDACTED].config.tar.gz [REDACTED].img.COPY recover/
# rm recover/vg--db--vda.[REDACTED]-set.1._config
#

# ssh -ax -l root [REDACTED] 'cd /shared/vmdisks && { df -h .;
/bin/hostname; ls -d rec*; }'
Password:
Filesystem Size Used Avail Use% Mounted on
/dev/md16 835G 14G 780G 2% /shared/vmdisks
[REDACTED]
ls: rec*: No such file or directory
# (cd /shared/vmdisks/recover && tar -cf - .) | gzip -9 >
../[REDACTED].recover.tar.gz
# scp -p ../[REDACTED].recover.tar.gz
root@[REDACTED]:/shared/vmdisks/recover.[REDACTED]/
Password:
[REDACTED].recover.tar.gz 100% 1718MB 63.6MB/s 00:27
# cd /shared/vmdisks
# scp -p $(find [A-Z]* -name recover -type d -prune -o \( -type f !
-name [REDACTED].img \) -print)
root@[REDACTED]:/shared/vmdisks/recover.[REDACTED]/
Password:
[REDACTED]_GUEST.img 100% 100GB
65.8MB/s 25:56
[REDACTED]_GUEST.info 100% 23
0.0KB/s 00:00
[REDACTED].info 100% 23
0.0KB/s 00:00
[REDACTED]_GUEST.img 100% 100GB
65.4MB/s 26:06
[REDACTED]_GUEST.info 100% 23
0.0KB/s 00:00
#

# mkdir recover.2
# cp -p [REDACTED].img recover.2/[REDACTED].img.COPY2
# echo $?

#

Note that it actually copied without error that time,
drive likely managed to successfully remap the sector it couldn't read.

# smartctl -x /dev/sda 2>&1 | sed -ne '/^ID#/,/prefailure/p'
ID# ATTRIBUTE_NAME FLAGS VALUE WORST THRESH FAIL RAW_VALUE
1 Raw_Read_Error_Rate POSR-K 197 197 051 - 302
3 Spin_Up_Time POS--K 198 198 021 - 3100
4 Start_Stop_Count -O--CK 100 100 000 - 46
5 Reallocated_Sector_Ct PO--CK 200 200 140 - 1
7 Seek_Error_Rate -OSR-K 200 200 000 - 0
9 Power_On_Hours -O--CK 093 093 000 - 5519
10 Spin_Retry_Count -O--CK 100 253 000 - 0
11 Calibration_Retry_Count -O--CK 100 253 000 - 0
12 Power_Cycle_Count -O--CK 100 100 000 - 44
192 Power-Off_Retract_Count -O--CK 200 200 000 - 36
193 Load_Cycle_Count -O--CK 200 200 000 - 20
194 Temperature_Celsius -O---K 124 100 000 - 26
196 Reallocated_Event_Count -O--CK 199 199 000 - 1
197 Current_Pending_Sector -O--CK 200 200 000 - 0
198 Offline_Uncorrectable ----CK 100 253 000 - 0
199 UDMA_CRC_Error_Count -O--CK 200 200 000 - 0
200 Multi_Zone_Error_Rate ---R-- 100 253 000 - 0
||||||_ K auto-keep
|||||__ C event count
||||___ R error rate
|||____ S speed/performance
||_____ O updated online
|______ P prefailure warning
#

Notice Current_Pending_Sector is now down to 0 - whereas earlier it was 1,
so the bad sector has been mapped out now (either recovered and
remapped, or remapped when written to). So seems we're good now.
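Also worth a glance at this point, the drive's own internal error log:
# smartctl -l error /dev/sda
to see whether it has recorded other recent events beyond the one we
tripped over.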

# already copied all this stuff to other host, so:
# rm [REDACTED].recover.tar.gz && rm -rf recover recover.2
#

# mdadm --detail /dev/md16 | less
/dev/md16:
Version : 0.90
Creation Time : Tue Jun 17 22:25:58 2014
Raid Level : raid1
Array Size : 877072320 (836.44 GiB 898.12 GB)
Used Dev Size : 877072320 (836.44 GiB 898.12 GB)
Raid Devices : 2
Total Devices : 2
Preferred Minor : 16
Persistence : Superblock is persistent

Update Time : Sat Aug 4 18:39:59 2018
State : active, degraded
Active Devices : 1
Working Devices : 2
Failed Devices : 0
Spare Devices : 1

UUID : 165d54b9:8605b578:10b78e4a:3daa69b7
Events : 0.2291861

Number Major Minor RaidDevice State
0 253 2 0 active sync
/dev/vg-db-sda/mdm.app.vcmp.dat.vmdisks
1 0 0 1 removed

2 253 22 - spare
/dev/vg-db-sdb/mdm.app.vcmp.dat.vmdisks
#
# mdadm --remove /dev/md16 /dev/vg-db-sdb/mdm.app.vcmp.dat.vmdisks
mdadm: hot removed /dev/vg-db-sdb/mdm.app.vcmp.dat.vmdisks
# mdadm --zero-superblock /dev/vg-db-sdb/mdm.app.vcmp.dat.vmdisks
# mdadm --add /dev/md16 /dev/vg-db-sdb/mdm.app.vcmp.dat.vmdisks
mdadm: added /dev/vg-db-sdb/mdm.app.vcmp.dat.vmdisks
#

Did the remirror of md16 - it completed fine this time, and once it
completed, all looks fine on the drives.
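(Resync progress along the way can be watched with, e.g.:
# cat /proc/mdstat
# mdadm --detail /dev/md16 | grep -iE 'state|rebuild'
the latter showing a "Rebuild Status : NN% complete" line while the
remirror is running.)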

While that was going on (actually started bit earlier), also did:
# dd if=/dev/zero of=.nulls obs=4096; rm .nulls
dd: writing to `.nulls': No space left on device
1719802064+0 records in
214975257+0 records out
880538652672 bytes (881 GB) copied, 7375.2 seconds, 119 MB/s
#

That was to write out most blocks on the filesystem - so any that might be
marginal would get remapped ... and the mirror completed fine, so it all
read okay.

Also ran [REDACTED] through a full reboot to ensure everything came up
and looked fine - that was after also having reenabled the guests, once
all the data was reading fine again.

Anyway, may not have shown all the steps in the preceding/above, but
showed at least key steps and data points.