Can you imagine that we have been wasting terabytes of data transfer on the internet because of newline characters, and that this must have cost our ISPs at least a billion dollars by now? So how did this crazy thing happen? Read on with patience to learn about it.
Before we continue the analysis, have a look at the following two pages:
Both contain the same information, except that the first HTML file has a newline character after every line while the second file contains no newline characters at all.
The figure below shows the document with NO NEWLINE CHARACTERS [FILE SIZE = 352 BYTES, as shown by the control panel at 110mb.com, where both sample HTML files are hosted]
DOCUMENT WITH NEWLINE CHARACTERS [FILE SIZE = 370 BYTES]
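If you want to repeat this kind of measurement yourself, here is a minimal sketch. The sample page in it is hypothetical (it is not one of my two hosted files), and the helper name newline_overhead is just something made up for illustration. It counts how many bytes of a document are line-break characters, for both Windows-style (CR+LF) and Unix-style (LF) line endings.

```python
# Hypothetical sample page; NOT the actual test files mentioned in the post.
lines = [
    "<html>",
    "<head>",
    "<title>Sample</title>",
    "</head>",
    "<body>",
    "<p>Hello</p>",
    "</body>",
    "</html>",
]


def newline_overhead(text: str) -> int:
    """Return how many bytes of `text` are carriage returns or line feeds."""
    data = text.encode("utf-8")
    stripped = data.replace(b"\r", b"").replace(b"\n", b"")
    return len(data) - len(stripped)


# Same page, once with CR+LF after every line and once with LF only.
with_crlf = "\r\n".join(lines) + "\r\n"
with_lf = "\n".join(lines) + "\n"

print(newline_overhead(with_crlf))  # 16 (8 lines * 2 bytes)
print(newline_overhead(with_lf))    # 8  (8 lines * 1 byte)
```

So the overhead depends on the line-ending style: two bytes per line with CR+LF, one byte per line with LF only.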
SO WHAT'S THE POINT?
The simple document above wastes 370 - 352 = 18 BYTES OF UNNEEDED DATA TRANSFER ON THE INTERNET EVERY TIME the version with newline characters is accessed. Now let us take another example.
Let's take the Yahoo home page.
compete.com says: there were 0.134 billion visitors to yahoo.com in December 2008.
Yahoo home page WITH NEWLINE CHARACTERS:
EACH YAHOO HOME PAGE [WITH NEWLINE CHARACTERS] = 40 KB
Thus the total data transfer for December 2008 [counting one home page load per visitor] = 40 KB * 0.134 billion = 5360 GB = 5.360 TERABYTES
Yahoo home page WITHOUT NEWLINE CHARACTERS:
There are at least 1000 newline characters in the home page [as of January 9, 2009].
Each newline character wastes about 2 bytes [assuming every line break is stored as a carriage return plus a line feed; with line feeds only it would be 1 byte].
UNNEEDED BYTES IN THE YAHOO HOME PAGE: 2 BYTES * 1000 NEWLINES = 2000 BYTES = 2 KB
SIZE OF THE YAHOO HOME PAGE AFTER REMOVING NEWLINE CHARACTERS [approximately] = 40 KB - 2 KB = 38 KB
Thus the total data transfer for December 2008 [that would have resulted had the newline characters been removed] = 38 KB * 0.134 billion = 5092 GB = 5.092 TERABYTES
TOTAL SAVINGS IN DATA TRANSFER FROM REMOVING NEWLINES IN THE YAHOO HOME PAGE:
5.360 TB - 5.092 TB = 0.268 TB = 268 GB in a single month!
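The arithmetic above can be rechecked with a few lines of Python. This is only a sketch under the same assumptions used in the post: one 40 KB page load per visitor, 1000 newlines of 2 bytes each, and decimal units (1 TB = 1,000,000,000 KB). The variable and function names are mine, chosen for illustration.

```python
VISITS = 134_000_000        # 0.134 billion visitors (compete.com, Dec 2008)
PAGE_KB = 40                # home page size with newlines, in KB
NEWLINES = 1000             # estimated newline count on the page
BYTES_PER_NEWLINE = 2       # assumption: CR+LF line endings

KB_PER_TB = 1_000_000_000   # decimal units, as used in the post


def monthly_tb(page_kb: float, visits: int) -> float:
    """Total monthly transfer in TB for a page of `page_kb` KB."""
    return page_kb * visits / KB_PER_TB


wasted_kb = NEWLINES * BYTES_PER_NEWLINE / 1000   # 2 KB per page load
with_newlines = monthly_tb(PAGE_KB, VISITS)
without_newlines = monthly_tb(PAGE_KB - wasted_kb, VISITS)
saved = with_newlines - without_newlines

print(f"with newlines:    {with_newlines:.3f} TB")     # 5.360 TB
print(f"without newlines: {without_newlines:.3f} TB")  # 5.092 TB
print(f"saved:            {saved:.3f} TB")             # 0.268 TB
```

Changing BYTES_PER_NEWLINE to 1 (line feeds only) halves the saving, so the result is quite sensitive to that assumption.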
This is just for the Yahoo home page alone. Now take the case of the crores of websites on the Internet. What would we save if we made an attempt to remove every unneeded newline character from the HTML source files? I am not saying that this has to be done manually. I expect it would be quite easy to have an automated program remove them for us, wouldn't it?
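As a rough idea of what such an automated program might look like, here is a minimal sketch in Python. It is an illustration under simplifying assumptions, not a production minifier: the function name strip_newlines and the list of protected tags are my own choices. It drops line breaks that sit between two tags, turns line breaks inside running text into a single space (so that words do not get glued together), and leaves pre, textarea, script and style blocks untouched because newlines can be meaningful there.

```python
import re

# Blocks whose content must not be altered (newlines may be significant).
PROTECTED = re.compile(
    r"<(pre|textarea|script|style)\b.*?</\1\s*>",
    re.IGNORECASE | re.DOTALL,
)
# One or more line breaks (with indentation) sitting between two tags.
BETWEEN_TAGS = re.compile(r">[ \t]*(?:\r?\n[ \t]*)+<")
# Any remaining run of line breaks, together with surrounding indentation.
LINE_BREAK = re.compile(r"[ \t]*(?:\r?\n[ \t]*)+")


def _squeeze(segment: str) -> str:
    """Remove line breaks between tags; replace the others with one space."""
    segment = BETWEEN_TAGS.sub("><", segment)
    return LINE_BREAK.sub(" ", segment)


def strip_newlines(html: str) -> str:
    """Return `html` with line breaks removed outside protected blocks."""
    out = []
    pos = 0
    for match in PROTECTED.finditer(html):
        out.append(_squeeze(html[pos:match.start()]))  # ordinary markup
        out.append(match.group(0))                     # protected, kept as-is
        pos = match.end()
    out.append(_squeeze(html[pos:]))
    return "".join(out)


sample = (
    "<html>\r\n<body>\r\n<p>Hello\r\nworld</p>\r\n"
    "<pre>a\r\nb</pre>\r\n</body>\r\n</html>\r\n"
)
result = strip_newlines(sample)
print(len(sample.encode("utf-8")), "->", len(result.encode("utf-8")))
```

One caveat: a line break between two inline elements (two links side by side, for example) is rendered by browsers as a space, so removing it can change how a page looks slightly. A real tool would need to handle such cases more carefully.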