Java Tip 94: How to open a non-HTML document from a servlet

A simplified way to send non-HTML files to the Web client

To open a file in a browser from a servlet, you simply write the file to the servlet’s output stream. While that seems simple enough, you must be aware of some things when opening non-HTML documents such as binary data or multimedia files.

You start by getting the servlet’s output stream:

    ServletOutputStream out = res.getOutputStream();

The Internet community uses MIME (Multipurpose Internet Mail Extensions) types to describe multipart, multimedia, and binary data sent over the Internet. It is important to set the MIME type of the file you want to open in the servlet’s response object. For this example, I will open a PDF document.

MIME types

Web browsers use MIME types to identify non-HTML files and to determine how to present the data contained in them. Plug-ins can be associated with a MIME type or types, so that when the Web browser downloads a file with that MIME type, the browser also launches the plug-in that handles the file. Other MIME types can be associated with external programs. When the browser downloads files of those MIME types, it launches the appropriate program to view the downloaded file.

MIME types are useful because they allow Web browsers to handle various file types without having the built-in knowledge. Java servlets can use MIME types to send non-HTML files such as Adobe PDF and Microsoft Word to browsers. Using the proper MIME type helps to ensure that the file gets displayed by the proper plug-in or external viewer. The Resources section provides links to a list of defined MIME types and additional articles on MIME types.

The MIME type for a PDF file is "application/pdf". To open a PDF file in a servlet, you set the content type in the response header to "application/pdf":

    // MIME type for pdf doc
    res.setContentType( "application/pdf" );  

To open a Microsoft Word document, you would set the response object’s content type to "application/msword" instead of "application/pdf":

    // MIME type for MSWord doc
    res.setContentType( "application/msword" );

For an Excel document, use the MIME type "application/vnd.ms-excel". In that MIME type, vnd marks a vendor-specific type; the prefix is part of the type name and must be included for the browser to recognize the file.

In some cases, the browser doesn’t recognize the file’s MIME type. That often happens when the required plug-in hasn’t been installed for a certain file type. In those cases, the browser will pop up a dialog box, asking the user whether he or she wants to open the file or save it to disk.
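As a rough illustration of the mapping described in this section, the sketch below picks a content type from a file name. The class MimeTypes, its method forFileName, and the fallback to "application/octet-stream" for unknown extensions are assumptions made for this example and are not part of the servlet API; only the three MIME types come from the text above.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper for illustration; not part of the servlet API.
final class MimeTypes {
    private static final String FALLBACK = "application/octet-stream";
    private static final Map<String, String> BY_EXTENSION = new HashMap<>();

    static {
        BY_EXTENSION.put("pdf", "application/pdf");
        BY_EXTENSION.put("doc", "application/msword");
        BY_EXTENSION.put("xls", "application/vnd.ms-excel");
    }

    private MimeTypes() {}

    // Returns the MIME type registered for the file's extension, or a
    // generic binary type (an assumption of this sketch) when unknown.
    static String forFileName(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return FALLBACK;
        }
        String extension = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return BY_EXTENSION.getOrDefault(extension, FALLBACK);
    }
}
```

In a servlet, the result could be passed straight to the response, for example res.setContentType(MimeTypes.forFileName("Example.pdf")).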

Content disposition

An HTTP response header named content-disposition allows the servlet to specify information about the file’s presentation. Using that header, you can indicate that the content should be opened separately (not actually in the browser) and that it should not be displayed automatically but rather upon some further action by the user. You can also suggest the filename to be used if the content is to be saved to a file. That filename would be the name of the file that appears in the Save As dialog box. If you don’t specify the filename, you are likely to get the name of your servlet in that box. To find out more about the content-disposition header, check out Resources.

In the servlet, you want to set that header as follows:

    res.setHeader("Content-disposition",
                  "attachment; filename=" +
                  "Example.pdf" );
    // attachment - since we don't want to open
    // it in the browser, but
    // with Adobe Acrobat, and set the
    // default file name to use.

If you were opening a Microsoft Word file, you would choose:

    res.setHeader("Content-disposition",
                  "attachment; filename=" +
                  "Example.doc" );
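The header value is easy to get wrong by hand, so it may help to build it in one place. The following is a minimal sketch, assuming a hypothetical helper class ContentDisposition that is not part of the servlet API. Unlike the snippets above, it wraps the filename in double quotes (the quoted-string form allowed by RFC 6266), so that names containing spaces are passed intact.

```java
// Hypothetical helper for illustration; not part of the servlet API.
final class ContentDisposition {
    private ContentDisposition() {}

    // Builds an "attachment" header value with a quoted filename.
    // Backslashes and double quotes inside the name are escaped.
    static String attachment(String fileName) {
        String escaped = fileName
                .replace("\\", "\\\\")
                .replace("\"", "\\\"");
        return "attachment; filename=\"" + escaped + "\"";
    }
}
```

A servlet could then call res.setHeader("Content-disposition", ContentDisposition.attachment("Example.pdf")).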

Wrapping it up

The rest is pretty simple. You need to create a java.net.URL object based on the name of the file you want to open. The string that is passed into the URL constructor should be a fully qualified URL to the file’s location. In this example, I open Adobe’s employment application form:

    String fileURL = 
"

Your URL string could be something like http://www.gr.com/pub/somefile.xls, but make sure the file you’re opening is consistent with the MIME type that was previously set in the HTTP response object.

    URL url = new URL ( fileURL );

Firewalls

If the connection has to go through a firewall, the last thing you need to take care of is making your URL connection. To establish a connection through the firewall, you need some information about your proxy server, such as its host name and port number. More information about establishing connections through a firewall can be found in the Resources section below.

If you are using Java 2, you should create a URLConnection object from the URL object and set the following system properties:

    URLConnection conn = url.openConnection();
    // Use the username and password you use to
    // connect to the outside world
    // if your proxy server requires authentication.
    String authentication = "Basic " +
        new sun.misc.BASE64Encoder().encode("username:password".getBytes());
    System.getProperties().put("proxySet", "true");
    System.getProperties().put("proxyHost", PROXY_HOST); // your proxy host
    System.getProperties().put("proxyPort", PROXY_PORT); // your proxy port
    conn.setRequestProperty("Proxy-Authorization", authentication);
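The snippet above relies on sun.misc.BASE64Encoder, an internal class that later JDKs no longer ship. On Java 8 or later, the same header value can be produced with the public java.util.Base64 class. The sketch below assumes a hypothetical helper class ProxyAuth, and the choice of UTF-8 for the credentials is also an assumption of this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper for illustration only.
final class ProxyAuth {
    private ProxyAuth() {}

    // Returns the value for a Proxy-Authorization header using the
    // Basic scheme: "Basic " + base64("user:password").
    static String basic(String user, String password) {
        String credentials = user + ":" + password;
        byte[] raw = credentials.getBytes(StandardCharsets.UTF_8);
        return "Basic " + Base64.getEncoder().encodeToString(raw);
    }
}
```

The result would be passed to conn.setRequestProperty("Proxy-Authorization", ...) in place of the authentication string built above.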

If you are using JDK 1.1, you may not be able to set the system properties. In that case, you should create the java.net.URL object with your proxy server information:

    url = new URL("http", PROXY_HOST,
                  Integer.parseInt(PROXY_PORT),
                  fileURL );
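On Java 5 and later there is a third option that this tip does not cover: java.net.Proxy lets you name the proxy for a single connection instead of setting system properties or rewriting the URL. The following is a sketch under that assumption; the class ProxiedConnections and its method names are hypothetical, and it assumes a proxy that does not require authentication.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URL;
import java.net.URLConnection;

// Hypothetical helper for illustration only.
final class ProxiedConnections {
    private ProxiedConnections() {}

    // Describes an HTTP proxy; the host name is resolved only when
    // a connection is actually made.
    static Proxy httpProxy(String host, int port) {
        return new Proxy(Proxy.Type.HTTP,
                         InetSocketAddress.createUnresolved(host, port));
    }

    // Prepares a connection to fileURL that will be routed through the
    // given proxy. No network traffic happens until the stream is read.
    static URLConnection open(String fileURL, Proxy proxy) throws IOException {
        return new URL(fileURL).openConnection(proxy);
    }
}
```

With this approach, the proxy choice stays local to one connection rather than affecting every connection in the servlet container.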
    // assumes authentication is not required

The home stretch

To start reading your file, you need to obtain the InputStream from the URLConnection (or URL) object. In this example, you wrap the InputStream with a BufferedInputStream.

If you are using the URLConnection, follow this code:

    BufferedInputStream  bis = new
        BufferedInputStream(conn.getInputStream());

If you are using the URL, follow this code:

    BufferedInputStream  bis = new
        BufferedInputStream(url.openStream());

Once you have done that, you simply write each byte from the InputStream to the servlet’s OutputStream:

    BufferedOutputStream bos = new 
        BufferedOutputStream(out);
    byte[] buff = new byte[2048];
    int bytesRead; 
    // Simple read/write loop.
    while(-1 != (bytesRead = bis.read(buff, 0, buff.length))) {
        bos.write(buff, 0, bytesRead);
    }
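The same loop can be tried outside a servlet container. The sketch below wraps it in a hypothetical helper class StreamCopy (the class name, the returned byte count, and the explicit flush are assumptions of this example) so that it works with any pair of streams, including in-memory ones.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper for illustration only.
final class StreamCopy {
    private StreamCopy() {}

    // Copies everything from in to out using the same 2 KB buffer
    // loop as above and returns the number of bytes copied.
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buff = new byte[2048];
        long total = 0;
        int bytesRead;
        while (-1 != (bytesRead = in.read(buff, 0, buff.length))) {
            out.write(buff, 0, bytesRead);
            total += bytesRead;
        }
        // Push any buffered bytes to the underlying stream.
        out.flush();
        return total;
    }
}
```

Passing a ByteArrayInputStream and a ByteArrayOutputStream is a quick way to check the loop without a network connection.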

Lastly, you close the streams in a finally block.

This example is implemented using the doPost method of a servlet that extends HttpServlet:

public void doPost(HttpServletRequest req, 
                   HttpServletResponse res)
   throws ServletException, IOException
{
    ServletOutputStream  out           = 
        res.getOutputStream ();
//---------------------------------------------------------------
// Set the output data's mime type
//---------------------------------------------------------------
    res.setContentType( "application/pdf" );  // MIME type for pdf doc
//---------------------------------------------------------------
// create an input stream from fileURL
//---------------------------------------------------------------
    String fileURL = 
        "
//------------------------------------------------------------
// Content-disposition header - don't open in browser and
// set the "Save As..." filename.
// *There is reportedly a bug in IE4.0 which  ignores this...
//------------------------------------------------------------
    res.setHeader("Content-disposition",
                  "attachment; filename=" +
                  "Example.pdf" );
//-----------------------------------------------------------------
// PROXY_HOST and PROXY_PORT should be your proxy host and port
// that will let you go through the firewall without authentication.
// Otherwise set the system properties and use URLConnection.getInputStream().
//-----------------------------------------------------------------
    BufferedInputStream  bis = null; 
    BufferedOutputStream bos = null;
    try {
        URL url = new URL( "http", PROXY_HOST, 
                           Integer.parseInt(PROXY_PORT), fileURL  );
        // Use Buffered Stream for reading/writing.
        bis = new BufferedInputStream(url.openStream());
        bos = new BufferedOutputStream(out);
        byte[] buff = new byte[2048];
        int bytesRead;
        // Simple read/write loop.
        while(-1 != (bytesRead = bis.read(buff, 0, buff.length))) {
            bos.write(buff, 0, bytesRead);
        }
    } catch(final MalformedURLException e) {
        System.out.println ( "MalformedURLException." );
        throw e;
    } catch(final IOException e) {
        System.out.println ( "IOException." );
        throw e;
    } finally {
        if (bis != null)
            bis.close();
        if (bos != null)
            bos.close();
    }
}
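One caveat about the finally block above: if bis.close() throws, bos is never closed. On Java 7 and later, a try-with-resources statement closes both streams even in that case. The sketch below illustrates the idea; the class ClosingCopy is hypothetical, and it uses InputStream.transferTo, which requires Java 9 or later, in place of the hand-written loop.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper for illustration only.
final class ClosingCopy {
    private ClosingCopy() {}

    // Copies source to target through buffered streams. Both streams
    // are closed when the try block exits, whether or not an
    // exception was thrown along the way.
    static void copyAndClose(InputStream source, OutputStream target)
            throws IOException {
        try (BufferedInputStream bis = new BufferedInputStream(source);
             BufferedOutputStream bos = new BufferedOutputStream(target)) {
            bis.transferTo(bos);
        }
    }
}
```

Resources are closed in reverse order of declaration, so the output stream is flushed and closed before the input stream.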

Conclusion

As you can see, opening a non-HTML document from a servlet is pretty simple, even outside a firewall. You can use that same code to open image files or other types of multimedia files by setting the appropriate MIME type. Today more information is being made available via the Web, and much of that information is stored in formats other than HTML. Writing a servlet to render non-HTML documents through your Web browser is an easy and convenient way to provide information to your users, surpassing the limits of HTML.

Marla Bonar, a consultant at Greenbrier & Russel in Phoenix, Ariz., has been programming in Java since the days of JDK 1.0.2. She is a firm believer in object-oriented architecture and design and the use of software patterns. She became a software engineer through her father’s encouragement.

Source: www.infoworld.com