PPRuNe Forums

Zeppelin 27th June 2011 05:06

Text document size
 
Can someone explain this to me?

I have a text file which is 568 KB and am updating it with a new one which is considerably larger but shows its size as only 68 KB.

Both files contain the same sort of info, i.e. a series of coordinates, so I don't quite understand why the larger one shows fewer KB.

Spitoon 27th June 2011 05:53

Is it really a text file (i.e. .TXT)? If not, and it is perhaps an MS Word doc, the extra size is probably down to the metadata stored with the file.

Zeppelin 27th June 2011 07:38

I've checked in Properties > File type and they are both listed as Text Document (.txt).

So I'm not sure whether, if I update, the new larger file (but smaller KB) is actually transferring all the new info... if you understand!

green granite 27th June 2011 08:10

Why not save it as xxx1.txt? Then you can compare before you overwrite the original.

Zeppelin 27th June 2011 09:24

I'm really puzzled, 'cos I opened both files in Notepad. File 1 contains about 60 lines of coordinates. File 2 has about 70 lines of similar coordinates, yet file 2 says 68 KB and file 1 568 KB.
Tried renaming and saving, still the same, so I just don't understand how it can show so few KB.

mixture 27th June 2011 09:49

Zeppelin,

If it is genuinely a plain text file, then you should be able to work out the expected size yourself.

1 character = 1 byte
1 kilobyte = 1024 bytes (or 1000 for approximation purposes !)

Basic mathematics will then tell you how many bytes a string of your coordinates is, and consequently how many 60 lines will come to.
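
The sum described above can be sketched in a few lines of Python. This is only a rough estimate under stated assumptions: one byte per character (plain ASCII/ANSI text) and a two-byte CR LF line ending as Windows Notepad writes. The function name and the 25-character line length are invented for illustration.

```python
def expected_size_bytes(line_count, chars_per_line, newline_bytes=2):
    """Estimate the size of a plain text file with one byte per character.

    newline_bytes=2 assumes Windows CR LF line endings; use 1 for LF only.
    """
    return line_count * (chars_per_line + newline_bytes)


# Hypothetical figures: 60 lines of 25 characters each.
size = expected_size_bytes(60, 25)
print(size, "bytes, about", round(size / 1024, 1), "KB")
```

On those assumptions 60 short lines come to well under 2 KB, so a size in the hundreds of KB points to far more lines, or extra data, than the 60 or 70 being counted.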

Mike-Bracknell 27th June 2011 12:00

568kb is pretty huge for a text file, so it's possible that it has some extraneous info in it somewhere.

Saab Dastard 27th June 2011 12:08

Line breaks (paragraph marks / carriage returns) and spaces all count!

Edit / select all - see if there's any "white space" in the large file.

SD
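
A minimal sketch of how the white space and line breaks mentioned above could be counted, rather than eyeballed with select-all. The function name and the sample coordinate line are made up for illustration; reading the file in binary mode is assumed so that nothing is translated on the way in.

```python
from collections import Counter


def whitespace_report(data: bytes) -> dict:
    """Count the bytes that are easy to miss in an editor."""
    counts = Counter(data)  # iterating bytes yields integer byte values
    return {
        "total_bytes": len(data),
        "spaces": counts[0x20],
        "tabs": counts[0x09],
        "carriage_returns": counts[0x0D],
        "line_feeds": counts[0x0A],
    }


# Invented sample: one coordinate line plus a line holding only spaces.
sample = b"51 28.5N 000 27.3W\r\n   \r\n"
print(whitespace_report(sample))
```

For a real file, something like `whitespace_report(open("File1.txt", "rb").read())` would give the same breakdown.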

Spurlash2 27th June 2011 12:12

Not an answer, but the different file sizes you mention are a classic when comparing .doc with .docx file sizes. (The .docx being far smaller)

...or...

Track changes or version numbers of the document. (But that doesn't make sense, as you are in Notepad :( )

You haven't snuck a picture in there, have you?

BOAC 27th June 2011 12:22

Check for a Macro?

FullOppositeRudder 27th June 2011 13:19

Hmmm - that's a very large text file :confused:

Along the lines of what SD has suggested, check that the larger file doesn't have some "hidden" characters or data hiding way past the end of your primary information. It's a very remote chance but with a simple processor like Notepad, that can sometimes happen.

I've encountered this when stripping satellite Keplerian elements of spurious text information prior to uploading them to a prediction program.

FWIW

F-O-R

BOAC 27th June 2011 15:15

Hmm - I guess someone has to do it.............:)

mixture 27th June 2011 15:56


"very remote chance but with a simple processor like Notepad, that can sometimes happen."

Erm, hidden characters in Notepad ???

You'd have to be inputting some pretty funky ASCII for that to happen !

I reckon he's probably created the doc in word or something else that generates formatting data.

FullOppositeRudder 28th June 2011 00:08


"Erm, hidden characters in Notepad ???

You'd have to be inputting some pretty funky ASCII for that to happen !"

OK - point taken. :ok:

Suggestion withdrawn! :suspect:

It could be a bug. Similar (if isolated) reporting anomalies seem to have been observed elsewhere - see here

f-o-r

Spitoon 28th June 2011 05:49

If the lists really are just co-ordinates and you wouldn't have to kill us afterwards, why not post the files up on the web somewhere for us to look at? I'm sure someone could give you an answer in seconds then!

mixture 28th June 2011 08:00


"It could be a bug"

Suggestion accepted ! :ok:

When in doubt, blame a bug (or, as per tradition, the user !).

Spurlash2 28th June 2011 08:39

For comparison
 
Using the old =RAND(1,5) text insertion trick, I got 1 paragraph, 98 words, 7 lines and 581 characters with spaces. The .txt file was 581 bytes, the .docx was 12.8 KB and the .doc was 22 KB. (This is just to illustrate the point about the different file formats - I'm not crazy or anything...)

Are you sure you're not using .doc???

Zeppelin 28th June 2011 09:36

2shared - download File1.txt

2shared - download File2.txt

OK, these are the 2 files - nothing very exciting, but I'm interested to know the difference.

mixture 28th June 2011 10:06


"but interested to know the difference"

At first glance, about 12,159 lines.....

$ wc -l File1.txt
8714 File1.txt
$ wc -l File2.txt
20873 File2.txt
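
For anyone without a Unix shell to hand, the same count can be approximated in Python. This is a sketch, not what was run above: the function name is hypothetical, and it counts LF bytes because that is what wc -l counts. The demo writes a throwaway file so the block runs anywhere; swap in "File1.txt" or "File2.txt" to check the real files.

```python
import os
import tempfile


def line_and_byte_count(path):
    """Return (number of LF bytes, size in bytes) for the file at path."""
    with open(path, "rb") as f:  # binary mode: no newline translation
        data = f.read()
    return data.count(b"\n"), len(data)


# Demo on a temporary two-line file with CR LF endings.
with tempfile.NamedTemporaryFile(delete=False, suffix=".txt") as tmp:
    tmp.write(b"line one\r\nline two\r\n")
demo_path = tmp.name

lines, size = line_and_byte_count(demo_path)
os.remove(demo_path)
print(lines, "lines,", size, "bytes")
```

Comparing the line count with the byte count for each file shows directly whether the size reported in Properties is plausible.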

Removing everything apart from basic printable characters.....
$ tr -cd '\40-\126' < File2.txt > filex.txt
$ ls -ltrh File2.txt filex.txt
-rw-r--r--@ 1 * * 565K 28 Jun 11:04 File2.txt
-rw-r--r-- 1 * * 524K 28 Jun 11:11 filex.txt

So you've just got a bigger text file. Nothing sinister hidden.
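
A note on the filter, with a hedged Python equivalent. tr reads backslash escapes as octal, so '\40-\126' keeps space through 'V' (octal 126 is decimal 86); the full printable ASCII range ends at tilde, which is octal 176. Either range also deletes CR and LF. If the lines end in CR LF, 20873 lines x 2 bytes is roughly 41 KB, close to the drop from 565K to 524K in the listing, which would fit the "nothing sinister" reading. The sketch below keeps decimal 32 to 126; the function name and sample bytes are invented for illustration.

```python
def keep_printable_ascii(data: bytes) -> bytes:
    """Drop every byte outside space (0x20) through tilde (0x7E).

    Note that this also removes CR and LF, so line endings count
    towards the bytes removed.
    """
    return bytes(b for b in data if 0x20 <= b <= 0x7E)


# Invented sample: a coordinate line, then two control bytes and some text.
sample = b"N51 28.5 W000 27.3\r\n\x00\x07hidden\r\n"
cleaned = keep_printable_ascii(sample)
removed = len(sample) - len(cleaned)
print(cleaned, "-", removed, "bytes removed")
```

Subtracting the expected line-ending bytes from the number removed leaves whatever is genuinely non-printable.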

BOAC 28th June 2011 10:07

A quick look shows that at some stage in their lives, both files had 3 macros attached. They do not appear to be there now.

A hex look at each file shows nought.

File 1 has a carriage return after each line, file 2 has a % after each line. I suspect file2 was written by some data extraction programme and the mode of writing has added 'length' to each line by not closing it?

Sorry I cannot be more help, but I'm sure someone will know! It might help to tell us how you 'came by' each file.
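
One way to check the line-ending observation above is to count each style directly. This is a minimal sketch with a hypothetical function name and made-up sample bytes; it makes no claim about what the two posted files actually contain.

```python
def line_ending_counts(data: bytes) -> dict:
    """Count CR LF pairs, bare LF and bare CR line endings."""
    crlf = data.count(b"\r\n")
    return {
        "CRLF": crlf,
        "LF only": data.count(b"\n") - crlf,  # LFs not preceded by CR
        "CR only": data.count(b"\r") - crlf,  # CRs not followed by LF
    }


# Invented sample mixing all three styles.
sample = b"a\r\nb\nc\rd\r\n"
print(line_ending_counts(sample))
```

Running it on both files (read with `open(path, "rb").read()`) would show whether they were written with different conventions.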


All times are GMT.