buffer : Java Glossary

go to home page B words local find full screen, hide local find menu Google search web for more information on this topic jump to foot of page translate this page with Babelfish 2008-01-08 by Roedy Green ©1996-2009 Canadian Mind Products
index page for letter ⇒ punctuation 0-9 A B C D E F G H I J K L M N O P Q R S T U V W X Y Z (all)
buffer
Because of hard disk latency, when you do I/O, it will go faster if you do it in a few big physical I/O chunks rather than a number of small ones. If you wrote data one byte at a time, you would have to wait for the disk arms to snap to the correct cylinder, and for the platter to rotate round the correct spot every time you wrote a byte. If you buffered at 64,000 characters, you would have to do this wait only once every 64,000 characters. Mechanical motion is in the order of 1000 times slower than electronics.

If you wrote a byte at a time, since the hardware works in 512-byte sectors at a time, the OS would need to read the sector, plop your byte into it and write the entire sector back. This would take at least 2 disk rotations, perhaps 3. Even if you wrote your data 512 bytes at a time, when you went to write the next sector, its spot would have just past the head, so you would have to wait an entire rotation for its spot to come round. If you wrote 131,072 bytes (still less than 1 physical track) at a pop, you could do that all in one rotation.

Ideally, if you have enough RAM, you do the I/O in one whacking huge file-sized unbuffered chunk. Java has a number of classes that let you process a file buffered in convenient small logical chunks, often line by line. The buffered classes transparently handle the physical I/O in bigger chunks, typically 4096 bytes. The classes store each large chunk for physical I/O in a separate piece of RAM called a buffer. Unless the buffer size for the physical I/O is at least twice as big as the size of the logical chunks you process, there is not much point in buffering. The extra buffering copying overhead will just slow you down.

The File I/O Amanuensis will teach you how to do I/O either buffered or unbuffered. You can try it both ways, and see which works faster. You can also experiment with buffer sizes. The bigger the buffer, the fewer the physical I/Os you need to process the file. However, the bigger the buffer, the more virtual RAM you will use, which may trigger more swapping I/O. Further, there is not much point in having a whacking big buffer for a tiny file. It will take only a few I/Os to process the file anyway.

You will find that buffer sizes that are a power of two tend to work faster than other sizes. This is because disk and RAM hardware are designed around some magic sizes, typically 256, 512, 1024, 2048, 4096, 8192, 16,384, 32,768, 65,536, 131,072 and 262,144 bytes. Buffers that are powers of two naturally do I/O in physical chunks that align on powers of two boundaries in the file. This too makes the I/O more efficient because the hardware works typically in 512 byte sector chunks. If you do unbuffered I/O, likewise try to start your I/Os on boundaries that are even multiples of some power of two, the higher the power of two the better. e.g. it is better to start I/O on boundaries that are even multiples of 8096 rather than just 128. Sometimes it pays to pad your fixed-length records up to the next power of two. If you can help it, arrange your logical record size and buffer size so that logical records are aligned so that they never (or rarely) span two buffer fulls. It also helps to have your buffers aligned on physical RAM addresses that are even powers of two as well, though you have no control of that in Java.

In the olden days, CØBØL programs used double buffering. They used two or more buffers per file. The computer would read ahead filling buffers while the program was busy processing one of the previous buffers. Oddly, Java does not support this efficient serial processing technique, though sometimes the operating system maintains its own private set of read-ahead buffers behind the scenes. Unfortunately, the OS’s cascaded buffering is less efficient than using a single layer. You have the overhead of copying plus the wasted RAM for the buffers that are not actually used for physical I/O. Java never has more than one buffer per file and hence cannot simultaneously process and do physical I/O, unless of course it uses Threads. Even with Threads, you can’t pull off double buffering with any ease.

The term double buffering also refers to a technique of constructing Images off screen then blasting them onscreen once they are complete, as a way of creating smoother animation.

If you wrote 128K a byte at a time using a 64K buffer there would be only two physical 64K I/Os. This would be slightly slower that using unbuffered I/O to write the entire 128K in one I/O because of the extra physical I/O, the RAM overhead for the buffer and the CPU overhead of copying the data to the buffer.

When To Buffer


CMP homejump to top
CMP logo
feedback Please email your feedback for publication, errors, omissions, broken/redirected link reports
and suggestions to improve this page to Roedy Green : feedback email
made with CSS
HTML Checked!
ICRA ratings logo
mindprod.com IP:[65.110.21.43]
Your face IP:[38.103.63.62] The information on this page is for non-military use only.
You are visitor number 14,526. Military use includes use by defence contractors.
You can get a fresh copy of this page from: or possibly from your local J: drive (Java virtual drive/mindprod.com website mirror)
http://mindprod.com/jgloss/buffer.html J:\mindprod\jgloss\buffer.html