PPRuNe Forums

Linux Corner (https://www.pprune.org/computer-internet-issues-troubleshooting/392362-linux-corner.html)

N727NC 16th November 2010 15:31

MB - breaking the mirror might let me get to the data, but might render both disks useless, so I am anxious only to go there when I have exhausted all other options (but I may have already done so).

MG23 - stupidly, I didn't take a record. I used du -csh /* 2> ~/filelog to find the largest directory and, happy that it was not important to me, deleted it. This last time I deleted one of my own data folders - which I no longer needed - but even that was insufficient to get a proper logon.
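A rough sketch of the same du idea with the results sorted, so the biggest directory floats to the top. It assumes a POSIX shell with du, sort and head; the function name largest_entries is made up for illustration, and names starting with a dot are not matched by the * glob.

```shell
# largest_entries [DIR] [N]: print the N largest entries directly under DIR
# (default /), biggest first, with sizes in kilobytes.
# Hypothetical helper name; runs in a subshell so its variables stay local.
largest_entries() (
    dir=${1:-/}
    dir=${dir%/}            # "/" becomes "", so "$dir"/* expands to /*
    count=${2:-10}
    # -s: one total per argument, -k: kilobytes (so sort -n works),
    # -x: do not cross into other filesystems.
    # Errors from unreadable directories are discarded.
    du -skx "$dir"/* 2>/dev/null | sort -rn | head -n "$count"
)
```

For example, largest_entries /var 5 would list the five largest entries under /var; run it as root if some directories are unreadable.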

Still stuffed ..........

Mike-Bracknell 16th November 2010 17:43


Originally Posted by N727NC (Post 6064894)
MB - breaking the mirror might let me get to the data, but might render both disks useless, so I am anxious only to go there when I have exhausted all other options (but I may have already done so).

Simply physically remove one of the disks from the server. Store it somewhere, and it can be your array master (for rebuilding the array) if you muck up mounting the remaining disk.

bnt 16th November 2010 20:36

If you get it back, can you install the baobab application? This is a friendly GTK+ app for examining disk usage information. I use it on Ubuntu, though the link I have suggests that you'd need to compile it from source. Anything major stands out pretty obviously.

N727NC 16th November 2010 20:40

I've recovered the machine using a 'failsafe' logon, sufficiently that I can take a backup off the data disks. When I've got the data I need, I'll flatten the server and start again.

Thanks to MB and MG23 - and the others before - for your support.

I still have no idea how it is filling several hundred gigabytes of disk in a few weeks. I'm sitting behind a Netgear firewall and the standard SUSE protections are running.

Thank you for that hint bnt - I'll certainly install baobab - it will help me to keep an eye on the disk consumption.

Unixman 17th November 2010 09:53

I would also suggest that a simple "lsof" might well be useful. This command lists all open files on a system, and you might well be able to identify which file is causing the problem. Alternatively, use fuser -c filesystem to list all the processes that have files open on that filesystem.

For example (from a Solaris box)

# lsof | more

COMMAND  PID USER  FD  TYPE DEVICE SIZE/OFF NODE NAME
sched      0 root cwd  VDIR  85,10      512    2 /
init       1 root cwd  VDIR  85,10      512    2 /
init       1 root txt  VREG  85,10    48952 1610 /sbin/init
init       1 root txt  VREG  85,10    41088 4499 /lib/libgen.so.1
init       1 root txt  VREG  85,10    51176 4537 /lib/libuutil.so.1
init       1 root txt  VREG  85,10    23276 4494 /lib/libdoor.so.1
init       1 root txt  VREG  85,10   143744 4526 /lib/libscf.so.1
init       1 root txt  VREG  85,10   870760 4509 /lib/libnsl.so.1
init       1 root txt  VREG  85,10    51780 4514 /lib/libnvpair.so.1
init       1 root txt  VREG  85,10    37400 4528 /lib/libsecdb.so.1
init       1 root txt  VREG  85,10  1640776 4480 /lib/libc.so.1
init       1 root txt  VREG  85,10   101036 4510 /lib/libmd.so.1
init       1 root txt  VREG  85,10    93924 4530 /lib/libsocket.so.1
init       1 root txt  VREG  85,10    27100 4483 /lib/libcmd.so.1

<snip>

Always look for regular files (VREG above; Linux lsof reports them as REG)
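As a sketch of how that listing could be boiled down, the filter below keeps only regular files and sorts them by size. It assumes the nine-column layout shown above (TYPE in field 5, SIZE/OFF in field 7, NAME in field 9); the function name largest_open_files is invented for illustration, rows whose size field is not a plain number are skipped, and a file name containing spaces would be cut at the first space.

```shell
# largest_open_files [N]: read lsof output on stdin and print the N largest
# regular files as "size pid command name", biggest first.
# Hypothetical helper; assumes TYPE is field 5, SIZE/OFF field 7, NAME field 9.
largest_open_files() (
    count=${1:-10}
    # Keep rows typed REG (Linux) or VREG (Solaris) with a numeric size.
    awk '($5 == "REG" || $5 == "VREG") && $7 ~ /^[0-9]+$/ { print $7, $2, $1, $9 }' |
        sort -rn | head -n "$count"
)
```

Typical use would be something like lsof | largest_open_files 5, run as root so that every process is visible.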




# fuser -c /var
/var: 965o 651o 603c 602o 588o 580co 520o 509o 478o 476o 472o 462c 303co 7o

Then use ps -ef | grep pid

# ps -ef | grep 580
smmsp 580 1 0 Nov 04 ? 0:04 /usr/lib/sendmail -Ac -q15m

No surprise that sendmail is writing to /var :8
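One caveat with ps -ef | grep 580 is that it also matches any other line that happens to contain "580". A sketch of a stricter lookup follows, using ps -p with the exact PID. The helper names pids_from_fuser and show_holders are invented here; the sketch relies on fuser printing bare PIDs on standard output and the filesystem name and letter codes on standard error.

```shell
# pids_from_fuser: read fuser report text on stdin (for example
# "/var: 965o 651o 603c") and print one bare PID per line.
# Hypothetical helper; also accepts PIDs that have no letter suffix.
pids_from_fuser() {
    awk '{
        sub(/^[^:]*:/, "")                 # drop the leading "name:" if any
        for (i = 1; i <= NF; i++)
            if ($i ~ /^[0-9]+[A-Za-z]*$/) {
                sub(/[A-Za-z]+$/, "", $i)  # strip usage codes such as "co"
                print $i
            }
    }'
}

# show_holders FILESYSTEM: show pid, user and command line of every process
# fuser reports for FILESYSTEM. Needs fuser and ps, and usually root.
show_holders() {
    fuser -c "$1" 2>/dev/null | pids_from_fuser | while read -r pid; do
        ps -p "$pid" -o pid= -o user= -o args=
    done
}
```

With that in place, show_holders /var would print one line per process holding files open on /var.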


BTW is swapd running?

N727NC 17th November 2010 21:56

Getting There
 
My thanks to all for their constructive help - especially to UNIXMAN for suggesting lsof, which I'll be using when all is back to normal.

Progress to date: I have rebuilt the OS disk from scratch, but I was unable to persuade the OS to let me rebuild the RAID. However, I managed to mount one of the RAID disks to recover the information - a full backup is running at the moment. When the backup is secure, I'll try to rebuild the RAID without reformatting both disks, in the hope that it will recover the mirror - any clues as to how best to go about this? At the moment one disk is ext4 (I was obliged to reformat it after the initial OS rebuild), but the other still thinks it is part of a RAID, albeit I have managed to mount both disks as /tmp and /tmp1.

The current rig has a 3 GB partition mounted as /SWAP - would swapd be more efficient? (I think I see what you are getting at - might the swap file have grown to fill the data space? Answer is, I think, no - as I only have a fixed swap partition).

......."Weird Trip, Man" (Oora - Edgar Broughton Band, circa 1972).

Unixman 18th November 2010 03:57

I would stick with a fixed swap partition unless there is a very good reason not to.
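To confirm what kind of swap is actually in use, Linux lists every active swap area in /proc/swaps, with a Type column that reads "partition" or "file" and a size in kilobytes. A minimal sketch, assuming a Linux box; the function name swap_areas is made up for illustration.

```shell
# swap_areas [FILE]: print "name type size_kb" for each swap area listed in
# FILE (default /proc/swaps, which exists on Linux only).
# Hypothetical helper; the first line of the file is a header and is skipped.
swap_areas() {
    awk 'NR > 1 { print $1, $2, $3 }' "${1:-/proc/swaps}"
}
```

If every line it prints says "partition", nothing is using a swap file.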

One thing you don't say is whether you are using software or hardware RAID for your mirroring.
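If it turns out to be Linux software RAID (md), the kernel reports the state of each array in /proc/mdstat, where a member map such as [UU] means both halves of a mirror are present and [U_] or [_U] means one is missing. A sketch for spotting a degraded mirror, assuming that layout; the function name degraded_arrays is invented for illustration and the sketch only reads the file, it changes nothing.

```shell
# degraded_arrays [FILE]: print the name of every md array in FILE
# (default /proc/mdstat) whose member map contains "_", i.e. a missing disk.
# Hypothetical helper; read-only.
degraded_arrays() {
    awk '
        /^md[0-9A-Za-z_]* :/ { name = $1 }      # e.g. "md0 : active raid1 ..."
        match($0, /\[[U_]+\]/) {                # member map such as [UU] or [_U]
            if (index(substr($0, RSTART, RLENGTH), "_") > 0) print name
        }
    ' "${1:-/proc/mdstat}"
}
```

An empty result would suggest that every array listed has all its members.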

N727NC 18th November 2010 12:53

So Far So Good
 
The RAID is a Linux software RAID. I can't find a menu on the Dell to implement a hardware RAID, which would have been my preference.

The OS is now running fine, and I am rebuilding the Samba server so that the various windoze clients can see their data again.

One issue is that because I mounted one of the drives as /tmp, there are lots of processes now using it, so the OS won't let me unmount it! Lesson to be learnt there, methinks. Any ideas on how to get out of this one?

rgbrock1 18th November 2010 13:42

It is not good practice to mount a disk on the /tmp mount point.

Start over and use some other mount points (/temp and /temp2, for example). If you don't start over you're going to have a host of issues.

Unixman 18th November 2010 17:12

Make certain that you haven't got an entry in /etc/fstab for the mounted filesystem; if you have, remove it and revert to what should be in /tmp... then reboot
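A small sketch of that check, assuming the usual fstab layout in which the second field of each line is the mount point. The function name fstab_entries is made up for illustration; it only reads the file and leaves the editing to you.

```shell
# fstab_entries MOUNTPOINT [FILE]: print the non-comment lines of FILE
# (default /etc/fstab) whose second field is exactly MOUNTPOINT.
# Hypothetical helper; read-only.
fstab_entries() {
    awk -v mp="$1" '$1 !~ /^#/ && $2 == mp' "${2:-/etc/fstab}"
}
```

Here fstab_entries /tmp would show any line that mounts something on /tmp; no output means there is no such entry. Because the match is exact, /tmp1 is not reported when you ask for /tmp.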

Mike-Bracknell 18th November 2010 17:53


Originally Posted by N727NC (Post 6069559)
The RAID is a Linux software RAID. I can't find a menu on the Dell to implement a hardware RAID, which would have been my preference

What PowerEdge is it, and what's the PERC in it? (Or just quote the tag number on the reverse and we'll find out.)

Usually, Dell provides drivers for use at build time for OSes with the PERCs, and you might find that the PERC BIOS is accessible at POST. CTRL-M?

N727NC 29th November 2010 11:50

All Sorted
 
Thanks to all for your support - rebuilt machine now running like a dream and no repetition of full disk so far. Also, KDE4 seems to be behaving better than on the earlier build, so I'm guessing something became corrupted along the way. It's always frustrating when you can't identify the problem, but it seems to be behind us now.

Now, about that backup........

vulcanised 25th February 2011 11:44

Linux not so hot?
 
Or is it just the programmers?

London Stock Exchange halts trading over technical glitch - Telegraph

Not the first time since the new system started.

Mike-Bracknell 25th February 2011 15:23

Apparently it's all due to the migration to Millennium IT a few weeks back. Unlikely to be to do with the underlying OS.

mixture 25th February 2011 17:04

vulcanised,

Before you start on your Microsoft fanboy mantras, I suggest you look into the history of Microsoft at the LSE, it's not exactly pretty. :rolleyes:

Look, it's simple. A computer is an idiot. It knows nothing. The number of humans and lines of code involved in getting a system at the LSE up and running is probably beyond your comprehension (we're talking everything from the BIOS up to the applications the LSE run and absolutely everything in between). It is an indisputable fact that, due to the ever-increasing complexity of computing itself plus the complex environment of the LSE, the s**t will hit the fan. The thing is nobody knows when, how often, and how serious the error will be. In the vast majority of cases it's just little bugs that can be squashed without the journalists wetting themselves, but occasionally something will happen that the external system users will notice.

Sorry for the tone, but I think someone had to tell things like they are!

vulcanised 25th February 2011 19:49


Microsoft fanboy mantras

I have never been accused of that and I'm most certainly not a fan of M$.

If you're not capable of making a point without such unwarranted drivel then you're better off remaining silent.

mixture 25th February 2011 20:05

You're the one who posted a link to overhyped journalistic nonsense.

No further comment.

AnthonyGA 25th February 2011 23:18

From what I've read, it looks like the problem is related to neither Microsoft's operating system nor Linux, but is instead a consequence of extremely poor IT management and design. A poor workman blames his tools.

You can build rock-solid systems on either type of operating system. You can also build garbage on either OS. If the previous system was using C# or .NET, those are already bad signs. A switch to Linux is also a bad sign. Both actions imply that the end user was simply trying to find the absolute cheapest, quickest "solution," without any regard for testing, safety, reliability, recovery, performance, etc.

You get what you pay for, and if you don't know how to write specs and/or don't know anything about IT, you usually get even less than you pay for.

MG23 26th February 2011 00:50


Originally Posted by AnthonyGA (Post 6271245)
A switch to Linux is also a bad sign. Both actions imply that the end user was simply trying to find the absolute cheapest, quickest "solution," without any regard for testing, safety, reliability, recovery, performance, etc.

Why would a switch to Linux imply that you weren't concerned about safety, reliability, performance, etc? I wouldn't run any important or safety-critical server software on Windows.

AnthonyGA 26th February 2011 05:53


Why would a switch to Linux imply that you weren't concerned about safety, reliability, performance, etc?
Because it's usually motivated by an attempt to save money, since many flavors of Linux are free. This desire to save money betrays an attitude that is more interested in money than in safety, reliability, etc. If you adopt free software in place of payware, you have to assume many of the responsibilities normally taken by the vendor of commercial software. But organizations that switch to Linux to save money typically aren't willing to assume those responsibilities, and so things go wrong.

The total burden of work and responsibility is going to be roughly the same no matter which operating system you use. Many organizations, when they try to switch to free software, are naïvely trying to get something for nothing.

If they simply wanted UNIX, then the obvious choice would be some sort of commercially supported version of UNIX, which would bring all the technical support and responsibility of a paid vendor with it. The fact that Linux was chosen instead strongly implies that the only motivation was lowering costs, with all other considerations taking a back seat. The catch is that you cannot lower costs that way; all you're really doing is shifting them around (instead of paying money to a third party, you'll be spending it on payroll for your own employees).


I wouldn't run any important or safety-critical server software on Windows.
Neither Windows nor Linux is appropriate for mission-critical or safety-of-life software. For that you either need a mainframe (in the case of business software) or an embedded system (for safety-of-life software). You can use Linux or Windows for the latter, but not just off the shelf. And for all potential Linux applications, I prefer UNIX or a UNIX descendant instead.

On the desktop, only Windows or (in some cases) Mac OS X is appropriate, unless the desktop role is very tightly and deliberately constrained. For servers, in most cases, I'd install UNIX or its immediate relatives. Linux is popular mainly for reasons that are unrelated to technical considerations. I wouldn't put Windows on a server unless it had to support something that runs specifically on Windows, such as Microsoft Exchange Server or Windows domain management.


All times are GMT.

