Text document size
Thread Starter
Joined: Mar 2001
Posts: 71
Likes: 0
From: UK/Spain
Can someone explain this to me?
I have a text file which is 568 KB and am updating it with a new one which is considerably larger but shows its size as only 68 KB.
Both files contain the same sort of info, i.e. a series of coordinates, so I don't quite understand why the larger one shows fewer KB.
Thread Starter
Joined: Mar 2001
Posts: 71
Likes: 0
From: UK/Spain
I've checked in Properties > File type and they are both listed as Text Document (.txt).
So I'm not sure whether, if I update, the new larger file (but smaller in KB) is actually transferring all the new info... if you understand!
Thread Starter
Joined: Mar 2001
Posts: 71
Likes: 0
From: UK/Spain
I'm really puzzled, 'cos I opened both files in Notepad. File 1 contains about 60 lines of coordinates. File 2 has about 70 lines of similar coordinates, yet file 2 says 68 KB and file 1 568 KB.
Tried renaming and saving, still the same, so I just don't understand how it can show so few KB.
Joined: Aug 2002
Posts: 3,663
Likes: 0
From: Earth
Zeppelin,
If it is genuinely a plain text file, then you should be able to work out the expected size yourself.
1 character = 1 byte (for plain ASCII/ANSI text)
1 kilobyte = 1024 bytes (or 1000 for approximation purposes !)
Basic mathematics will then tell you how many bytes a string of your coordinates is, and consequently how many 60 lines will come to.
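To put numbers on the arithmetic above, here is a rough sketch in Python. The 30 characters per line is a made-up figure for illustration (nobody has posted the actual line length), and the 2 bytes per line break assumes Windows-style CR+LF endings as Notepad writes them.

```python
def estimated_size_bytes(line_count: int, chars_per_line: int, newline_bytes: int = 2) -> int:
    """Estimate the size of a plain ASCII text file.

    Each character is 1 byte; each line also carries its line break
    (2 bytes for CR+LF, 1 byte for a bare LF).
    """
    return line_count * (chars_per_line + newline_bytes)


def to_kb(size_bytes: int) -> float:
    """Convert bytes to kilobytes using 1 KB = 1024 bytes."""
    return size_bytes / 1024


if __name__ == "__main__":
    # Hypothetical: 60 lines of 30 characters each.
    size = estimated_size_bytes(60, 30)
    print(f"{size} bytes = {to_kb(size):.2f} KB")
```

On those assumptions 60 lines come to just under 2 KB, nowhere near 568 KB, so either the lines are far longer or there are far more of them than it looks.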
If it is genuinely a plain text file, then you should be able to work out the expected size yourself.
1 character = 1 byte
1 kilobyte = 1024 bytes (or 1000 for approximation purposes !)
Basic mathematics will then tell you how many bytes a string of your coordinates is, and consequently how many 60 lines will come to.
Administrator
Joined: Mar 2001
Aviation Qualifications: PPL
Posts: 8,121
Likes: 686
From: Twickenham, home of rugby
Line breaks (paragraph marks / carriage returns) and spaces all count!
Edit / select all - see if there's any "white space" in the large file.
SD
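If eyeballing the white space in Notepad is awkward, something like the following Python sketch could tally it instead. It simply sorts every byte into spaces/tabs, line breaks, or everything else; the function name and the sample data are my own for illustration, not taken from the files in question.

```python
def byte_breakdown(data: bytes) -> dict:
    """Count how many bytes are spaces/tabs, line breaks, or anything else."""
    counts = {"spaces_tabs": 0, "line_breaks": 0, "other": 0}
    for b in data:
        if b in (0x20, 0x09):      # space, tab
            counts["spaces_tabs"] += 1
        elif b in (0x0D, 0x0A):    # carriage return, line feed
            counts["line_breaks"] += 1
        else:
            counts["other"] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample: one coordinate line followed by a line of stray spaces.
    sample = b"51.5 -0.1\r\n  \r\n"
    print(byte_breakdown(sample))
```

For a real file, read it in binary mode (`open(path, "rb").read()`) so that no line-ending translation hides what is actually on disk.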
Joined: Apr 2005
Posts: 366
Likes: 0
From: Earth
Not an answer, but the different file sizes you mention are a classic when comparing .doc with .docx file sizes. (The .docx being far smaller)
...or...
Track changes or version numbers of the document. (But that doesn't make sense, as you are in Notepad.)
You haven't snuck a picture in there, have you?
Joined: May 2009
Posts: 611
Likes: 62
From: Down under
Hmmm - that's a very large text file 
Along the lines of what SD has suggested, check that the larger file doesn't have some "hidden" characters or data hiding way past the end of your primary information. It's a very remote chance but with a simple processor like Notepad, that can sometimes happen.
I've encountered this when stripping spurious text information from satellite Keplerian elements prior to uploading them to a prediction program.
FWIW
F-O-R

Joined: Aug 2002
Posts: 3,663
Likes: 0
From: Earth
Quote: "...very remote chance but with a simple processor like Notepad, that can sometimes happen."
You'd have to be inputting some pretty funky ASCII for that to happen !
I reckon he's probably created the doc in Word or something else that generates formatting data.

Joined: May 2009
Posts: 611
Likes: 62
From: Down under
Quote: "Erm, hidden characters in Notepad ??? You'd have to be inputting some pretty funky ASCII for that to happen !"
Suggestion withdrawn!
It could be a bug. Similar (if isolated) reporting anomalies seem to have been observed elsewhere - see here
f-o-r
Last edited by FullOppositeRudder; 28th June 2011 at 00:40. Reason: link included
Joined: Apr 2005
Posts: 366
Likes: 0
From: Earth
For comparison
Using the old =RAND(1,5) text insertion trick; I got 1 paragraph, 98 words, 7 lines and 581 characters with spaces. The .txt file was 581 bytes, the .docx was 12.8 KB and the .doc was 22 KB. (This is just to illustrate the point about the different file formats - I'm not crazy or anything...)
Are you sure you're not using .doc???
Thread Starter
Joined: Mar 2001
Posts: 71
Likes: 0
From: UK/Spain
2shared - download File1.txt
2shared - download File2.txt
OK, these are the 2 files - nothing very exciting, but I'm interested to know the difference.
Joined: Aug 2002
Posts: 3,663
Likes: 0
From: Earth
Quote: "...but intrested to know the difference"
$ wc -l File1.txt
8714 File1.txt
$ wc -l File2.txt
20873 File2.txt
Removing everything apart from basic printable characters.....
$ tr -cd '\40-\176' < File2.txt > filex.txt
$ ls -ltrh File2.txt filex.txt
-rw-r--r--@ 1 * * 565K 28 Jun 11:04 File2.txt
-rw-r--r-- 1 * * 524K 28 Jun 11:11 filex.txt
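So you've just got a bigger text file. Nothing sinister hidden. The roughly 41K that was stripped is consistent with the line breaks alone (about two bytes per line over 20,873 lines).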
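For anyone without a Unix shell to hand, the same two checks can be approximated in Python. This is only a sketch of the idea (count the newlines, then keep just the printable ASCII bytes 0x20 to 0x7E and see how much disappears); the function names and the sample data are mine.

```python
def count_lines(data: bytes) -> int:
    """Count newline bytes, which is what `wc -l` reports."""
    return data.count(b"\n")


def keep_printable(data: bytes) -> bytes:
    """Drop everything except printable ASCII (space through tilde)."""
    return bytes(b for b in data if 0x20 <= b <= 0x7E)


if __name__ == "__main__":
    # Hypothetical sample: two coordinate lines with CR+LF endings.
    sample = b"N51 28.5 W000 27.1\r\nN40 25.0 W003 42.2\r\n"
    stripped = keep_printable(sample)
    print("lines:", count_lines(sample))
    print("bytes removed:", len(sample) - len(stripped))
```

If the number of bytes removed is about twice the line count, all that went was the CR+LF at the end of each line, i.e. nothing is hiding in the file.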
Per Ardua ad Astraeus
Joined: Mar 2000
Posts: 18,575
Likes: 4
From: UK
A quick look shows that at some stage in their lives, both files had 3 macros attached. They do not appear to be there now.
A hex look at each file shows nought.
File 1 has a carriage return after each line, file 2 has a % after each line. I suspect file2 was written by some data extraction programme and the mode of writing has added 'length' to each line by not closing it?
Sorry I cannot be more help, but I'm sure someone will know! It might help to tell us how you 'came by' each file.
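One way to settle how each file ends its lines is to count the terminators directly. A minimal Python sketch, assuming the file is read as raw bytes; the function name and sample are hypothetical.

```python
def line_ending_counts(data: bytes) -> dict:
    """Count CR+LF pairs, lone CRs and lone LFs in raw file bytes."""
    crlf = data.count(b"\r\n")
    lone_cr = data.count(b"\r") - crlf   # CRs not followed by LF
    lone_lf = data.count(b"\n") - crlf   # LFs not preceded by CR
    return {"CRLF": crlf, "CR": lone_cr, "LF": lone_lf}


if __name__ == "__main__":
    # Hypothetical sample mixing all three styles.
    print(line_ending_counts(b"a\r\nb\nc\rd"))
```

Running it over both files (`open(path, "rb").read()`) would show whether one was written with a different line-ending convention from the other.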