1.11. Working with Cookies
1.11.1. Problem
You need to work with a system that uses cookies to store state, and you need to be able to set cookies as well as keep track of cookies set by the server.
1.11.2. Solution
HttpClient handles cookies automatically. If you need to keep track of a cookie set by the server, simply use the same instance of HttpClient for each request in a session. If you need to set a cookie, create an instance of Cookie, and add it to HttpState. The following example sends a Cookie to the server:
import java.io.IOException;
import org.apache.commons.httpclient.Cookie;
import org.apache.commons.httpclient.HttpClient;
import org.apache.commons.httpclient.HttpException;
import org.apache.commons.httpclient.HttpMethod;
import org.apache.commons.httpclient.methods.GetMethod;
HttpClient client = new HttpClient( );
System.out.println( "Making Request without Cookie: " );
makeRequest(client);
System.out.println( "Making Request with Cookie: " );
Cookie cookie = new Cookie(".discursive.com", "test_cookie", "hello", "/", null, false);
client.getState( ).addCookie( cookie);
makeRequest(client);
private static void makeRequest(HttpClient client) throws IOException, HttpException {
    String url = "http://www.discursive.com/cgi-bin/jccook/cookie_test.cgi";
    HttpMethod method = new GetMethod( url );
    client.executeMethod( method );
    String response = method.getResponseBodyAsString( );
    System.out.println( response );
    method.releaseConnection( );
    method.recycle( );
}
This example hits a CGI script that tests for the presence of a cookie named test_cookie. One request is made without the cookie and another request is made with the cookie. The following output is produced:
Making Request without Cookie:
<h1>test_cookie NOT PRESENT</h1>
Making Request with Cookie:
<h1>test_cookie PRESENT</h1>
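The cookie above is created with the domain .discursive.com, while the request goes to www.discursive.com. The following is a minimal sketch of the domain-matching rule that lets such a cookie be sent. It uses the JDK's java.net.HttpCookie as a stand-in rather than HttpClient's own matching code, so read it as an illustration of the rule and not of HttpClient internals. The class name CookieDomainSketch and the host www.example.com are made up for this example.

```java
import java.net.HttpCookie;

// Illustration only: uses the JDK's HttpCookie, not HttpClient's Cookie class.
class CookieDomainSketch {

    // Returns true if a cookie scoped to cookieDomain would match the given host.
    static boolean wouldSend(String cookieDomain, String host) {
        return HttpCookie.domainMatches(cookieDomain, host);
    }

    public static void main(String[] args) {
        // Domain used by the cookie in the example above
        String domain = ".discursive.com";
        System.out.println("www.discursive.com matches: " + wouldSend(domain, "www.discursive.com"));
        // Hypothetical unrelated host
        System.out.println("www.example.com matches: " + wouldSend(domain, "www.example.com"));
    }
}
```

A leading dot in the cookie domain is what allows the cookie to apply to hosts under that domain and not only to one exact host name.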
1.11.3. Discussion
Cookies are used by a number of application servers to manage user sessions; JSESSIONID cookies are used by most J2EE application servers and servlet containers to keep track of a session. Because HttpClient automatically handles cookies, if a server sets a cookie, it will be added to the HttpState instance associated with HttpClient. If you need to get a list of cookies associated with an HttpState, call getCookies( ) to obtain an array of Cookie objects. The following code retrieves an array of cookie objects, printing the domain, path, name, and value of each Cookie:
HttpClient client = new HttpClient( );
//execute some methods...
Cookie[ ] cookies = client.getState( ).getCookies( );
for( int i = 0; i < cookies.length; i++ ) {
    Cookie cookie = cookies[i];
    String domain = cookie.getDomain( );
    String path = cookie.getPath( );
    String name = cookie.getName( );
    String value = cookie.getValue( );
    System.out.println( "Cookie: " + domain + ", " + path + ", " + name + ", " + value );
}
Two cookie specifications are in common use: the Netscape draft specification and RFC 2109. Some servers follow the Netscape draft and others follow RFC 2109; because of this, HttpClient offers a COMPATIBILITY mode that should work with most servers. The default cookie policy for HttpClient is the RFC 2109 policy. If you are having problems with cookies, change the cookie policy to the COMPATIBILITY policy, which is a public static int in the CookiePolicy class. To change the cookie policy, call setCookiePolicy() on the HttpState associated with HttpClient, as follows:
HttpClient client = new HttpClient( );
//To use the Compatibility policy
client.getState( ).setCookiePolicy(CookiePolicy.COMPATIBILITY);
//To use the Netscape Draft policy
client.getState( ).setCookiePolicy(CookiePolicy.NETSCAPE_DRAFT);
//To use the RFC 2109 policy; this is the default
client.getState( ).setCookiePolicy(CookiePolicy.RFC2109);
There is also a third approach, outlined in RFC 2965, which supersedes RFC 2109. However, there is no code-level support for this third approach in Commons yet.
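To make the difference between the two cookie styles concrete, here is a small sketch that parses one header of each kind. It relies on the JDK's java.net.HttpCookie parser, not on HttpClient's CookiePolicy classes, and both header strings are hypothetical examples written for this illustration. The JDK parser treats a header without a Version or Max-Age attribute as a Netscape-style (version 0) cookie and one that carries them as a version 1 cookie.

```java
import java.net.HttpCookie;
import java.util.List;

// Illustration only: the JDK's HttpCookie parser, not HttpClient's CookiePolicy.
class CookieVersionSketch {

    // Parses one Set-Cookie header and returns the first cookie found.
    static HttpCookie parseFirst(String header) {
        List<HttpCookie> cookies = HttpCookie.parse(header);
        return cookies.get(0);
    }

    public static void main(String[] args) {
        // Netscape-draft style: no Version attribute (hypothetical header)
        HttpCookie netscape = parseFirst("Set-Cookie: test_cookie=hello; path=/");
        // RFC 2109 style: carries Version and Max-Age (hypothetical header)
        HttpCookie rfc2109 = parseFirst(
            "Set-Cookie: test_cookie=hello; Version=1; Path=/; Max-Age=3600");

        System.out.println(netscape.getName() + " parsed as version " + netscape.getVersion());
        System.out.println(rfc2109.getName() + " parsed as version " + rfc2109.getVersion());
    }
}
```

A client that expects only one of these styles can misread or reject the other, which is the kind of problem a lenient compatibility mode is meant to smooth over.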
1.12. Handling Redirects
1.12.1. Problem
You need to access a server that may send an arbitrary number of redirects.
1.12.2. Solution
Before executing an HttpMethod, call setFollowRedirects(true) on the method; HttpClient will take care of following any redirects a server may return in a response. The following example shows what happens when a method requests a CGI script that returns a 302 (moved temporarily) response code:
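HttpClient client = new HttpClient( );
String url = "http://www.discursive.com/cgi-bin/jccook/redirect.cgi";
System.out.println( "Executing Method not following redirects: " );
executeMethod( client, url, false );
System.out.println( "Executing Method following redirects: " );
executeMethod( client, url, true );
private static void executeMethod(HttpClient client, String url, boolean follow) throws IOException, HttpException {
    HttpMethod method = new GetMethod( url );
    method.setFollowRedirects( follow );
    client.executeMethod( method );
    System.out.println( "Response Code: " + method.getStatusCode( ) );
    System.out.println( method.getResponseBodyAsString( ) );
    method.releaseConnection( );
}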
This example executes two GetMethod instances; the first method is configured not to follow redirects, and the second is configured to follow redirects. The first method is executed, and the server sends a 302 response code. Since this method is not configured to follow redirects, HttpClient does not make another request. When the second method is executed, HttpClient follows the initial redirect to a redirect2.cgi script, which sends another redirect to /jccook/index.html:
Executing Method not following redirects:
0 INFO [main] org.apache.commons.httpclient.HttpMethodBase - Redirect
requested but followRedirects is disabled
Response Code: 302
Executing Method following redirects:
Response Code: 200
<html>
<head>
<title>JCCook Example</title>
</head>
<body>
<h1>Hello World!</h1>
</body>
</html>
1.12.3. Discussion
HttpClient can handle any of the following response codes specifying a redirect:
Status Code 302: HttpStatus.SC_MOVED_TEMPORARILY
Status Code 301: HttpStatus.SC_MOVED_PERMANENTLY
Status Code 303: HttpStatus.SC_SEE_OTHER
Status Code 307: HttpStatus.SC_TEMPORARY_REDIRECT
When one of these response codes is received, HttpClient sends another GET request for the resource specified in the Location header. The following is the first request sent by a method configured to follow redirects:
GET /cgi-bin/jccook/redirect.cgi HTTP/1.1
User-Agent: Jakarta Commons-HttpClient/3.0final
Host: www.discursive.com
The redirect.cgi script will then send a 302 Moved response, supplying a Location header that points to redirect2.cgi:
HTTP/1.1 302 Moved
Date: Sat, 15 May 2004 19:30:49 GMT
Server: Apache/2.0.48 (Fedora)
Location: /cgi-bin/jccook/redirect2.cgi
Content-Length: 0
Content-Type: text/plain; charset=UTF-8
HttpClient then sends another GET request for the resource specified in the previous response:
GET /cgi-bin/jccook/redirect2.cgi HTTP/1.1
User-Agent: Jakarta Commons-HttpClient/3.0final
Host: www.discursive.com
The redirect2.cgi script is configured to send a redirect to /jccook/index.html, and its response to the previous request does just that:
HTTP/1.1 302 Moved
Date: Sat, 15 May 2004 19:30:49 GMT
Server: Apache/2.0.48 (Fedora)
Location: /jccook/index.html
Content-Length: 0
Content-Type: text/plain; charset=UTF-8
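Both Location headers in this exchange are path-only values, so a client has to resolve each one against the URL it just requested before it can send the next GET. The sketch below shows that resolution step using the JDK's java.net.URI. It is an illustration of the idea and not HttpClient's own redirect code, and the class and method names are made up for this example.

```java
import java.net.URI;

// Illustration only: resolves Location values with java.net.URI, not HttpClient.
class LocationResolveSketch {

    // Resolves a Location header value against the URL of the request that produced it.
    static String resolve(String requestUrl, String location) {
        return URI.create(requestUrl).resolve(location).toString();
    }

    // A Location value with no scheme is a relative reference.
    static boolean isRelative(String location) {
        return !URI.create(location).isAbsolute();
    }

    public static void main(String[] args) {
        String first = "http://www.discursive.com/cgi-bin/jccook/redirect.cgi";
        // Location values taken from the two 302 responses shown above
        String second = resolve(first, "/cgi-bin/jccook/redirect2.cgi");
        String third = resolve(second, "/jccook/index.html");
        System.out.println(second);
        System.out.println(third);
        System.out.println("relative: " + isRelative("/jccook/index.html"));
    }
}
```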
How HttpClient handles redirect responses can be further customized by three configurable parameters on HttpClient. REJECT_RELATIVE_REDIRECT causes HttpClient to throw an exception if a server sends a Location header with a relative URL; for instance, if the redirect.cgi script returns a Location header of ../index.html, the redirection causes an exception if REJECT_RELATIVE_REDIRECT is set to true. If ALLOW_CIRCULAR_REDIRECTS is set to false, HttpClient throws an exception if a series of redirects includes the same resource more than once; setting it to true permits such loops. MAX_REDIRECTS allows you to specify a maximum number of redirects to follow. The following example sets all three parameters on an instance of HttpClientParams associated with an instance of HttpClient:
HttpClient client = new HttpClient( );
HttpClientParams params = client.getParams( );
params.setBooleanParameter(HttpClientParams.REJECT_RELATIVE_REDIRECT, false);
params.setBooleanParameter(HttpClientParams.ALLOW_CIRCULAR_REDIRECTS, false);
params.setIntParameter(HttpClientParams.MAX_REDIRECTS, 10);
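To show how a redirect limit and circular-redirect detection interact, here is a small simulation of a redirect-following loop. It is a sketch under stated assumptions and not HttpClient's implementation: the "server" is a hypothetical in-memory map from a path to the Location it redirects to, and the class name, method names, and exception type are invented for this example.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Illustration only: a simulated redirect-following loop, not HttpClient's implementation.
class RedirectLoopSketch {

    // Hypothetical stand-in for a server: maps a path to the Location it redirects to.
    // Paths missing from the map are treated as final (non-redirect) responses.
    private final Map<String, String> redirects = new HashMap<>();

    void addRedirect(String from, String to) {
        redirects.put(from, to);
    }

    // Follows redirects from start, enforcing a hop limit and optional loop detection.
    String follow(String start, int maxRedirects, boolean allowCircular) {
        Set<String> visited = new HashSet<>();
        String current = start;
        int hops = 0;
        while (redirects.containsKey(current)) {
            if (hops >= maxRedirects) {
                throw new IllegalStateException("Maximum redirects (" + maxRedirects + ") exceeded");
            }
            // Set.add returns false when the path has been seen before
            if (!allowCircular && !visited.add(current)) {
                throw new IllegalStateException("Circular redirect to " + current);
            }
            current = redirects.get(current);
            hops++;
        }
        return current;
    }

    public static void main(String[] args) {
        RedirectLoopSketch server = new RedirectLoopSketch();
        // The two redirects from the exchange shown earlier in this recipe
        server.addRedirect("/cgi-bin/jccook/redirect.cgi", "/cgi-bin/jccook/redirect2.cgi");
        server.addRedirect("/cgi-bin/jccook/redirect2.cgi", "/jccook/index.html");
        System.out.println(server.follow("/cgi-bin/jccook/redirect.cgi", 10, false));
    }
}
```

With circular redirects allowed, the hop limit is the only thing that stops a loop, which is one reason to keep a maximum in place either way.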
1.13. SSL
1.13.1. Problem
You need to execute a method using HTTP over Secure Sockets Layer (SSL).
1.13.2. Solution
If you are working with a server whose certificate is signed by a certificate authority included in the Java Secure Socket Extension (JSSE), HttpClient automatically handles HTTP over SSL; just use a URL that starts with https. The following example retrieves Amazon.com's sign-in page using HTTP over SSL:
import org.apache.commons.httpclient.HttpClient;
import org.apache.commons.httpclient.HttpException;
import org.apache.commons.httpclient.HttpMethod;
import org.apache.commons.httpclient.methods.GetMethod;
HttpClient client = new HttpClient( );
String url = "https://www.amazon.com/gp/flex/sign-in.html";
HttpMethod method = new GetMethod( url );
client.executeMethod( method );
String response = method.getResponseBodyAsString( );
System.out.println( response );
method.releaseConnection( );
method.recycle( );
This example executes a simple GetMethod constructed with a URL starting with https. The output of this example is:
0 WARN [main] org.apache.commons.httpclient.HttpMethodBase - Response
content length is not known
297 WARN [main] org.apache.commons.httpclient.HttpMethodBase - Response
content length is not known
<html>
<head><title>Amazon.com Sign In</title>
</head>
.......... Content ..................
</html>
1.13.3. Discussion
HttpClient handles SSL automatically if it can verify the authenticity of a certificate against an authority; this is why this recipe is so similar to Recipe 11.3. The example in this recipe only works if you are dealing with a site that has a certificate signed by a well-known authority. The Java Runtime Environment (JRE) keeps track of the certificates of all the known certificate authorities in a file named cacerts. cacerts can be found in ${JAVA_HOME}/jre/lib/security/cacerts; it is a keystore that has a default password of changeit. For a list of certificate authorities in Java, execute the following command line and supply the default password:
keytool -list -keystore C:\j2sdk1.4.2_04\jre\lib\security\cacerts
The list will contain certificate fingerprints for Thawte, Entrust, Verisign, and other commercial certificate authorities. If you wish to use the JSSE without having to write your own ProtocolSocketFactory, you need to obtain a certificate signed by an authority.
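The same list of trusted authorities can also be read from code. The following is a minimal sketch that asks the JDK's default TrustManagerFactory for its accepted issuers. It assumes the default trust store is the cacerts file described above, which is the usual case but can be overridden by system properties, and the class name TrustedIssuersSketch is made up for this example.

```java
import java.security.KeyStore;
import java.security.cert.X509Certificate;
import javax.net.ssl.TrustManager;
import javax.net.ssl.TrustManagerFactory;
import javax.net.ssl.X509TrustManager;

// Illustration only: lists the issuers trusted by the JRE's default trust store.
class TrustedIssuersSketch {

    // Returns the certificates accepted by the default X509TrustManager.
    static X509Certificate[] defaultIssuers() throws Exception {
        TrustManagerFactory factory =
            TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
        // Passing null asks the factory to load the default trust store (normally cacerts)
        factory.init((KeyStore) null);
        for (TrustManager manager : factory.getTrustManagers()) {
            if (manager instanceof X509TrustManager) {
                return ((X509TrustManager) manager).getAcceptedIssuers();
            }
        }
        return new X509Certificate[0];
    }

    public static void main(String[] args) throws Exception {
        for (X509Certificate issuer : defaultIssuers()) {
            System.out.println(issuer.getSubjectX500Principal().getName());
        }
    }
}
```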
1.14. Accept a Self-Signed Certificate
1.14.1. Problem
You need to work with a server that is using a self-signed certificate.
1.14.2. Solution
Provide a custom SSLProtocolSocketFactory that is configured to trust your self-signed certificate. A sample implementation of SSLProtocolSocketFactory named EasySSLProtocolSocketFactory is available via HttpClient's CVS repository, and the following example uses it to trust a self-signed certificate:
import org.apache.commons.httpclient.HttpClient;
import org.apache.commons.httpclient.HttpException;
import org.apache.commons.httpclient.HttpMethod;
import org.apache.commons.httpclient.contrib.ssl.EasySSLProtocolSocketFactory;
import org.apache.commons.httpclient.methods.GetMethod;
import org.apache.commons.httpclient.protocol.Protocol;
import org.apache.commons.httpclient.protocol.ProtocolSocketFactory;
HttpClient client = new HttpClient( );
String url = "https://pericles.symbiont.net/jccook";
ProtocolSocketFactory socketFactory = new EasySSLProtocolSocketFactory( );
Protocol https = new Protocol( "https", socketFactory, 443 );
Protocol.registerProtocol( "https", https );
HttpMethod method = new GetMethod( url );
client.executeMethod( method );
String response = method.getResponseBodyAsString( );
System.out.println( response );
method.releaseConnection( );
method.recycle( );
This executes and accepts the self-signed certificate from pericles.symbiont.net:
Word up, this page was served using SSL!
1.14.3. Discussion
EasySSLProtocolSocketFactory and EasyX509TrustManager can be obtained from HttpClient's CVS in the src/contrib directory. If you do not want to check out the source code from CVS, you can also obtain these two classes from ViewCVS on cvs.apache.org. HttpClient's CVS repository can be accessed at http://cvs.apache.org/viewcvs.cgi/jakarta-commons/httpclient/, and the two classes are in the src/contrib/org/apache/commons/httpclient/contrib/ssl directory. To use these classes, you must integrate them into your own project, customizing the behavior of these classes as you see fit. EasySSLProtocolSocketFactory uses the EasyX509TrustManager to validate a certificate. To customize the criteria for certificate acceptance, alter the implementation of EasyX509TrustManager. For example, if you only want to accept a certificate from a specific hostname, change the implementation of the isServerTrusted() method in EasyX509TrustManager.
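As one possible shape for a stricter acceptance rule, the sketch below accepts a server only when its certificate matches a fingerprint you supply. It is not the EasyX509TrustManager source. It is written against the standard javax.net.ssl.X509TrustManager interface and its checkServerTrusted() method, and the class name PinnedCertTrustManager, the sha256Hex helper, and the placeholder pin used in main are all assumptions made for this illustration.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.cert.CertificateException;
import java.security.cert.X509Certificate;
import javax.net.ssl.SSLContext;
import javax.net.ssl.TrustManager;
import javax.net.ssl.X509TrustManager;

// Illustration only: a trust manager that accepts one pinned certificate.
// It is not the EasyX509TrustManager source; the pinned value is supplied by the caller.
class PinnedCertTrustManager implements X509TrustManager {

    private final String pinnedSha256Hex;

    PinnedCertTrustManager(String pinnedSha256Hex) {
        this.pinnedSha256Hex = pinnedSha256Hex.toLowerCase();
    }

    // Lowercase hex SHA-256 digest of the given bytes (for example, a DER-encoded certificate).
    static String sha256Hex(byte[] data) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(data)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is not available", e);
        }
    }

    @Override
    public void checkServerTrusted(X509Certificate[] chain, String authType)
            throws CertificateException {
        if (chain == null || chain.length == 0) {
            throw new CertificateException("No server certificate presented");
        }
        // chain[0] is the server's own certificate
        String actual = sha256Hex(chain[0].getEncoded());
        if (!actual.equals(pinnedSha256Hex)) {
            throw new CertificateException("Unexpected certificate fingerprint: " + actual);
        }
    }

    @Override
    public void checkClientTrusted(X509Certificate[] chain, String authType)
            throws CertificateException {
        throw new CertificateException("Client certificates are not accepted by this sketch");
    }

    @Override
    public X509Certificate[] getAcceptedIssuers() {
        return new X509Certificate[0];
    }

    public static void main(String[] args) throws Exception {
        // Placeholder pin; replace with the fingerprint of your server's certificate
        X509TrustManager manager = new PinnedCertTrustManager(sha256Hex(new byte[0]));
        SSLContext context = SSLContext.getInstance("TLS");
        context.init(null, new TrustManager[] { manager }, null);
        System.out.println("SSLContext ready: " + context.getProtocol());
    }
}
```

Pinning a single fingerprint is narrower than trusting any self-signed certificate, at the cost of having to update the pin whenever the server's certificate changes.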