HTTP : Java Glossary

2008-08-22 by Roedy Green ©1996-2008 Canadian Mind Products
HTTP
Hypertext Transfer Protocol. A protocol used on the Internet by web browsers to transport text and graphics. It focuses on grabbing a page at a time, rather than setting up a session. Applets also use it to download jars, classes and resources. Browsers use it to download files and images, not just HTML text.

Message Headers From Browser To Server

Fields in the headers let browsers and servers communicate. For example:
HTTP Headers that Browsers Send Servers
Field Typical Value Meaning
User-Agent: Opera ⇒ Opera/9.51 (Windows NT 6.0; U; en)
Firefox ⇒ Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.0.1) Gecko/2008070208 Firefox/3.0.1
Sea Monkey ⇒ Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.8.1.15) Gecko/20080621 SeaMonkey/1.1.10
Flock ⇒ Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.8.1.16) Gecko/20080714 Firefox/2.0.0.16 Flock/1.2.4
Safari ⇒ Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US) AppleWebKit/525.19 (KHTML, like Gecko) Version/3.1.2 Safari/525.21
IE 8 ⇒ Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0; SLCC1; .NET CLR 2.0.50727; Media Center PC 5.0; .NET CLR 3.0.04506)
IE 7 ⇒ Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0; SLCC1; .NET CLR 2.0.50727; Media Center PC 5.0; .NET CLR 3.0.04506)
Netscape ⇒ Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.8.1.5pre) Gecko/20070710 firefox/2.0.0.4 Navigator/9.0b2
Which browser is being used.
Host: localhost:8081 Destination host, as server:port.
Accept: application/xhtml+voice+xml;version=1.2, application/x-xhtml+voice+xml;version=1.2, application/x-shockwave-flash,text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,video/x-mng,image/png,image/jpeg,image/gif;q=0.2,text/css,*/*;q=0.1
or
text/html, image/gif, image/jpeg, *; q=.2, */*; q=.2
MIME types the browser is willing to accept. The encoding of this field is described in RFC 2616 section 14 and in the friendlier w3.org version. Roughly, the q numbers define your preference: the higher the number, the higher the preference. The default is 1. The q applies to the preceding MIME type. You set this with URLConnection.setRequestProperty( "Accept", …); not "accept" as the Sun docs erroneously suggest.
Accept-Language: en Language the browser is willing to accept.
Accept-Charset: windows-1252, utf-8, utf-16, iso-8859-1;q=0.6, *;q=0.1 Character set encodings the browser is willing to accept.
Accept-Encoding: deflate, gzip, x-gzip, identity, *;q=0 compression schemes the browser is willing to accept.
  • deflate: zlib format defined in RFC 1950 plus the deflate compression mechanism described in RFC 1951. This is a stripped down gzip without the header.
  • gzip, alias x-gzip: Java-style gzip RFC 1952 Lempel-Ziv coding with a 32 bit CRC.
  • compress, alias x-compress, UNIX compress
  • identity means as-is, no compression. Use it in the Accept-Encoding header, but not in the Content-Encoding header. Just leave out Content-Encoding if the encoding is identity.
Referer: http://mindprod.com/jgloss/http.html the web page that contained the link that triggered this request.
If-Modified-Since: Mon, 06 Feb 2006 01:24:23 GMT Only bother with the request if the file has changed since this date, otherwise the browser already has a copy in cache.
Connection: Keep-Alive asks the server to keep the socket open for further messages.
Content-Type: application/x-www-form-urlencoded MIME type of the payload to the server.
Content-Length: 114 length in encoded bytes of the payload to the server.
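
As a rough illustration of how a Java client might set a few of the request headers listed above, here is a minimal sketch. The class name RequestHeaderSketch, the User-Agent string and the sample header values are invented for this example; they are not part of the com.mindprod.http package. Nothing is sent to the server here, since the connection is only prepared, not opened.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;

/** Illustrative only: a hypothetical helper, not code from com.mindprod.http. */
final class RequestHeaderSketch {

    /** Builds an unconnected HttpURLConnection carrying browser-like request headers. */
    static HttpURLConnection prepare(String url) {
        try {
            // openConnection only creates the object; no bytes go out yet.
            HttpURLConnection conn =
                    (HttpURLConnection) URI.create(url).toURL().openConnection();
            conn.setRequestMethod("GET");
            // Sample values, chosen for the example rather than copied from a real browser.
            conn.setRequestProperty("Accept", "text/html,text/plain;q=0.8,*/*;q=0.1");
            conn.setRequestProperty("Accept-Language", "en");
            conn.setRequestProperty("Accept-Encoding", "gzip, deflate");
            conn.setRequestProperty("User-Agent", "ExampleClient/0.1");
            // Becomes an If-Modified-Since header: skip the body if unchanged for 24 hours.
            conn.setIfModifiedSince(System.currentTimeMillis() - 24L * 60 * 60 * 1000);
            return conn;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        HttpURLConnection conn = prepare("http://localhost:8081/index.html");
        System.out.println(conn.getRequestProperties());
    }
}
```

Some headers, such as Host, Connection and Content-Length, are normally filled in by the JDK itself, so the sketch leaves them alone.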

Beware of using HttpURLConnection.setFollowRedirects( false); this reportedly causes trouble in recent JDKs. Even when it is set to true, HttpURLConnection will not automatically follow HTML-level redirects such as <META HTTP-EQUIV="Refresh".
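
Because a META refresh lives in the HTML rather than in the HTTP headers, a client that wants to follow one has to find it in the page text itself. The following is only a sketch of one way to do that with a regular expression. The class name MetaRefreshSketch is hypothetical, and the pattern assumes the http-equiv attribute comes before content, which real pages do not guarantee.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative only: a hypothetical helper for spotting META refresh redirects. */
final class MetaRefreshSketch {

    // Matches e.g. <META HTTP-EQUIV="Refresh" CONTENT="5; URL=http://example.com/">
    // Assumes http-equiv precedes content and that the URL contains no quotes or spaces.
    private static final Pattern META_REFRESH = Pattern.compile(
            "<meta\\s+http-equiv\\s*=\\s*[\"']?refresh[\"']?\\s+content\\s*=\\s*[\"']"
                    + "\\s*\\d+\\s*;\\s*url\\s*=\\s*([^\"'>\\s]+)",
            Pattern.CASE_INSENSITIVE);

    /** Returns the redirect target named in a META refresh tag, if one is found. */
    static Optional<String> refreshTarget(String html) {
        Matcher m = META_REFRESH.matcher(html);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String page = "<html><head><META HTTP-EQUIV=\"Refresh\" "
                + "CONTENT=\"0; URL=http://mindprod.com/jgloss/http.html\"></head></html>";
        System.out.println(refreshTarget(page).orElse("no refresh found"));
    }
}
```

A full HTML parser would be more robust; the regular expression is just the shortest way to show the idea.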

Message Headers From Server To Browser

HTTP Headers that Servers Send Browsers
Field Typical Value Meaning
Server: Apache/2.0.55 (NETWARE) mod_perl/1.99_12 Perl/v5.8.4 Which server software is being used.
Accept-Ranges: bytes Informs the browser that the server supports downloading just parts of files, down to single-byte granularity.
Keep-Alive: timeout=15, max=99 how long, in seconds, to keep this socket open for more messages, and how many more requests it will accept.
Connection: Keep-Alive asks the browser to keep the socket open for further messages.
Content-Type: image/png MIME type of the payload from the server. Also used to carry the charset encoding, e.g. Content-Type: text/html; charset=utf-8
Content-Encoding: gzip gzip or x-gzip or deflate, or not present if no compression.
Content-disposition: attachment;filename="smile.png" Server suggests a filename to save this download under.
Content-Length: 842 length in encoded bytes of the payload from the server.
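
To show how a client might act on the Content-Encoding response header, here is a small sketch that unwraps a response body according to the encoding named by the server. The class and method names are made up for the example. It assumes Java 9 or later (for InputStream.readAllBytes) and treats "deflate" as the zlib-wrapped form described in RFC 1950, which is what InflaterInputStream expects by default.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Locale;
import java.util.zip.GZIPInputStream;
import java.util.zip.InflaterInputStream;

/** Illustrative only: a hypothetical decoder keyed on the Content-Encoding value. */
final class ContentDecodingSketch {

    /** Returns the uncompressed payload; a null encoding means the header was absent. */
    static byte[] decode(byte[] body, String contentEncoding) {
        String enc = contentEncoding == null
                ? "identity"
                : contentEncoding.trim().toLowerCase(Locale.ROOT);
        try (InputStream in = wrap(new ByteArrayInputStream(body), enc)) {
            return in.readAllBytes();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static InputStream wrap(InputStream raw, String enc) throws IOException {
        switch (enc) {
            case "gzip":
            case "x-gzip":
                return new GZIPInputStream(raw);      // RFC 1952
            case "deflate":
                return new InflaterInputStream(raw);  // zlib wrapper, RFC 1950 + 1951
            case "identity":
            case "":
                return raw;                           // no compression
            default:
                throw new IOException("Unsupported Content-Encoding: " + enc);
        }
    }
}
```

In real code the body would come from the connection's input stream rather than a byte array; the byte array just keeps the sketch self-contained.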

In the real world, the conversations between browser/client and server are much more complicated and slipshod than you might suppose. Each query often results in a flurry of permanent and temporary redirects back and forth. Each element on an HTML page must be requested independently. Sometimes servers will send back a failure code, then send the page anyway. Or they will send a 404 with an OK text response code. Sometimes servers refuse HEAD requests, but accept the equivalent GET. Sometimes servers send back https: in response to an http: request. Sometimes servers give you a totally different page from the one you requested and don’t tell you the one you wanted is no longer available. Sometimes servers redirect to localhost, or send back gibberish messages. Sometimes a server won’t send you a page if you have recently requested it; it expects you to have cached it. Browsers just do their best to muddle through. When you start emulating browsers with code, you get pretty flaky programs.

Language and Charset

You might wonder where the server encodes the language and character set. Often not in the HTTP header (although Content-Type may carry a charset parameter), but embedded in the HTML documents, with tags like this:
<!-- embedding language and charset inside an HTML document -->
<meta http-equiv="Content-Language" content="en">
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
Embedding this information makes it easier for web page authors to control, even if it makes finding the information slightly more difficult for the browser.
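
Since the charset may show up either in the Content-Type response header or in a meta tag like the one above, a client has to look in both places. The sketch below tries the header first and then the HTML. The class name CharsetSniffSketch is hypothetical, and the ISO-8859-1 fallback is an assumption chosen for the example, not something the page prescribes.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative only: a hypothetical charset lookup, header first, then meta tag. */
final class CharsetSniffSketch {

    // Finds charset=xxx in either a Content-Type header value or an HTML meta tag.
    private static final Pattern CHARSET = Pattern.compile(
            "charset\\s*=\\s*[\"']?([A-Za-z0-9][A-Za-z0-9._:-]*)",
            Pattern.CASE_INSENSITIVE);

    /** Either argument may be null, e.g. when the header is missing. */
    static Charset detect(String contentTypeHeader, String htmlHead) {
        String name = find(contentTypeHeader);
        if (name == null) {
            name = find(htmlHead);
        }
        if (name != null && Charset.isSupported(name)) {
            return Charset.forName(name);
        }
        return StandardCharsets.ISO_8859_1; // assumed fallback for this sketch
    }

    private static String find(String text) {
        if (text == null) {
            return null;
        }
        Matcher m = CHARSET.matcher(text);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String meta = "<meta http-equiv=\"Content-Type\" "
                + "content=\"text/html; charset=iso-8859-1\">";
        System.out.println(detect("text/html", meta));
    }
}
```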

Sample Code

This code for doing GET and POST is from the com.mindprod.http package. You can download the whole package. Code to do a GET:
Code to do a POST:
Base class for Get, Post, and Probe:
Code to read the response either as bytes (readBytesBlocking) or converted to a String (readStringBlocking):
Code to build the &parm=value command string:
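
The original listings are not reproduced here. As a stand-in for the last one, this is a minimal sketch of how a parm=value string could be built with URLEncoder; it is not the com.mindprod.http code. The class name QueryStringSketch is invented, UTF-8 is an assumed encoding, and the Charset overload of URLEncoder.encode needs Java 10 or later.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Illustrative only: a hypothetical builder for application/x-www-form-urlencoded data. */
final class QueryStringSketch {

    /** Joins the entries as name=value pairs separated by &, URL-encoding both sides. */
    static String encode(Map<String, String> parms) {
        StringJoiner joined = new StringJoiner("&");
        for (Map.Entry<String, String> e : parms.entrySet()) {
            joined.add(URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                    + "="
                    + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }
        return joined.toString();
    }

    public static void main(String[] args) {
        Map<String, String> parms = new LinkedHashMap<>(); // keeps insertion order
        parms.put("name", "Roedy Green");
        parms.put("q", "a&b=c");
        System.out.println(encode(parms)); // prints name=Roedy+Green&q=a%26b%3Dc
    }
}
```

For a GET the result would be appended to the URL after a ?; for a POST it would be written as the request body with Content-Type: application/x-www-form-urlencoded.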

Under the Hood

What happens when your Java-based browser requests a page?
  1. URL.openConnection just sets up a place to build the HTTP header. It does no communicating with the outside world.
  2. HttpURLConnection.connect() requests sending the header to the server.
  3. This request triggers opening a TCP/IP socket connection to the server. This is done by sending a SYN connection request packet. The server sends back a SYN+ACK. Then the client sends an ACK, upon which may be piggybacked some data.
  4. This triggers sending the GET header composed of all the header fields set up before the .connect call. The GET request header includes a list of the encodings and compression algorithms the browser would like in response. .connect does not return until the HTTP header is safely sent out the wire.
  5. The browser calls HttpURLConnection.getResponseCode to see if the request went OK. This blocks until the server responds with an HTTP response header.
  6. Then the browser calls HttpURLConnection.getInputStream and reads the text of the message from the server containing the requested web page. Using the standard TCP/IP protocol flow-control features, the server sends data only as fast as the browser can read it.
  7. The browser then scans the web page for the urls of embedded images and puts out GET requests for them.
  8. Then various images usually come back from the server on the original socket. The browser could elect to request each image on its own socket so they can arrive simultaneously.
The stream is made purely of printable characters. The server can detect the start of a new GET request by looking for line terminators.
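
The first six steps above can be sketched from the client side roughly as follows. This is an illustration, not the com.mindprod.http code: the class name GetWalkthroughSketch, the Accept value, the five-second timeouts and the UTF-8 decoding are all assumptions made for the example. Exactly when the header bytes go out on the wire is an implementation detail of the JDK, so the comments only mark where each step is requested. Java 9 or later is assumed for readAllBytes.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;

/** Illustrative only: a hypothetical walk through a single GET with HttpURLConnection. */
final class GetWalkthroughSketch {

    /** Fetches the page at the given URL and returns its body as text. */
    static String fetch(String url) throws IOException {
        // Step 1: build the connection object and its headers; no traffic yet.
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setRequestMethod("GET");
        conn.setRequestProperty("Accept", "text/html, text/plain;q=0.8");
        conn.setConnectTimeout(5000); // milliseconds, arbitrary for the sketch
        conn.setReadTimeout(5000);
        try {
            // Steps 2 to 4: open the socket and start the exchange.
            conn.connect();
            // Step 5: blocks until the response header arrives.
            int code = conn.getResponseCode();
            if (code != HttpURLConnection.HTTP_OK) {
                throw new IOException("Unexpected response code " + code);
            }
            // Step 6: read the body; assumes the page is UTF-8 text.
            try (InputStream in = conn.getInputStream()) {
                return new String(in.readAllBytes(), StandardCharsets.UTF_8);
            }
        } finally {
            conn.disconnect();
        }
    }
}
```

A call such as GetWalkthroughSketch.fetch("http://localhost:8081/index.html") would return the page text, provided a server is listening there. Steps 7 and 8, scanning for embedded images and requesting them, are left out.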

Speeding Up HTTP

There are several things you might consider to speed up HTTP transmissions.

Learning More

RFC 1521 (obsolete) MIME part 1.

RFC 1522 (obsolete) MIME part 2, non ASCII.

RFC 1945 HTTP 1.0 specification.

RFC 2045 MIME Part One: Format of Internet Message Bodies, specifies the various headers used to describe the structure of MIME messages.

RFC 2046 MIME Part Two: Media Types, describes the general structure of the MIME media typing system and defines an initial set of media types.

RFC 2047 MIME Part Three: Message Header Extensions for non-ASCII text

RFC 2048 MIME Part Four: Registration Procedures

RFC 2049 MIME Part Five: Conformance Criteria and Examples, provides some illustrative examples of MIME message formats.

RFC 2183 The Content-Disposition Header Field, used to suggest a filename for a download.

RFC 2068 (obsolete) HTTP/1.1 specification.

RFC 2616 HTTP/1.1 specification, replacing RFC 2068.

RFC 2617 HTTP authentication: how to send a username and password in HTTP headers to restrict access.


Sun’s Javadoc on the URLConnection class
Sun’s Javadoc on the HttpURLConnection class
CGI
Details on HTTP headers
File I/O Amanuensis: to see how to write code that reads and writes via HTTP-CGI
forms: see the raw socket information exchanged
HTTP Client
MIME
network properties
remote file access
response codes
RFC
