Related Topics: XML Magazine

XML: Article

Generating Preformatted Reports in Excel

Generating Preformatted Reports in Excel

It's quite common for developers to present data from the database (or elsewhere) for analysis by more than one team in an organization. Over the years, Microsoft's Excel has been one of the preferred tools for spreadsheet-type reports. Now, in combination with XML technologies, we can control the formatting of the spreadsheets.

With Web-enabled applications so popular, it's customary to give end users the ability to generate the reports themselves. This article discusses the automatic generation of preformatted reports. We'll build an application that has a menu of the available reports or report-generation criteria and the ability to generate preformatted reports in Excel using XML and XSLT.

Interim Solution
Given that Excel can read from a character-delimited file, we could pull out the data from the database and save it in a TAB DELIMITED text file with a .xls extension. The file could be saved on a Web server and we could provide a link to it for any interested teams. This would enable the teams to get the reports whenever they want, but this solution presents some serious concerns.

Drawbacks of the Interim Solution
For example, the report isn't preformatted. The column widths are the same, so the user has to change the widths manually to get a better view. Features that could be incorporated are missing, such as titles in bold, important data marked in different colors, text wrapping for bigger text data, headers, and footers.

In essence, it is simply unformatted data that could be viewed in an Excel sheet. The end user has to go through all the formatting before the report can be printed. And since these reports have to be dynamic, the formatting process has to be repeated every time the reports are generated.

Another drawback is that all the teams see the data in the same format. If one of the teams expects the data in a different format, they have to change it manually or we have to create different applications to suit the different requirements.

XML Feature of Excel 2002 (Office XP)
Office XP includes Excel 2002, which provides support for XML. In other words, Excel 2002 can save data as an XML document as well as read data from an XML document. But Excel can't read from or write to just any XML document; it requires the XML data in a specific format - XML spreadsheet format.

How Can We Use This Feature?
First we get the raw data for the report as an XML document. We then create an XSL stylesheet to convert the data to another XML document (the XML spreadsheet format) that can be read by Excel 2002. Our report is now an XML document in Excel's XML spreadsheet format, and will have all the data and related formatting information. In order to do this we have to understand Excel's XML spreadsheet format. Let's take a look at some basic techniques that Excel uses to save the data and formatting information in XML spreadsheet format.

How Excel 2002 Saves/Reads Data from XML
Excel 2002 saves the data using the XML Spreadsheet Schema (XML-SS). So an Excel document that's saved as an XML spreadsheet has all its data and formatting information stored as XML elements that use the XML-SS.

The <Workbook> element is the parent of all. It contains the <Worksheet> elements for each worksheet. Each sheet has an associated <Table> element. Within the table each row is stored as a element, the cell is stored as a element, the data in the cell is stored as a element, and so on. This structure takes care of storing all the data.

To store the formatting information, Excel's XML spreadsheet uses the <Style> element. A unique ID is generated for every unique combination of formatting information. The styling information is stored within this <Style> element. This style ID is referenced whenever this particular type of formatting is required.

Here's an example of a style element having the following formatting characteristics:

  • Alignment set to Center
  • Text size set to 16
  • Font set to Courier
  • Text style set to Bold


<Alignment ss:Horizontal=
"Center" ss:Vertical="Bottom" />
<Font ss:FontName="Courier" x:Family=
"Modern" ss:Size="16" ss:Bold="1" />
</Style>

Note that this particular style element has its ID set to s26. Whenever we require the formatting characteristics listed above, a reference will be made to the style element shown above. Here is an example of the usage (note the reference to StyleID S26):

<Cell ss:Index="2" ss:StyleID="s26">
<Data ss:Type="String">Report Title
</Data>
</Cell>

As a result of this, the data Report Title will be displayed on the Excel sheet with the styling information listed earlier.

While we can create the <Style> elements ourselves, an easier option would be to save an Excel sheet with the desired formatting as an XML spreadsheet and pick up the <Style> elements from there. We'll discuss this trick in detail later.

Thus the <Style> elements take care of the formatting information. The Excel sheet in the XML spreadsheet format is basically data as a value of specific XML elements; the attributes of XML elements store the formatting information.

Other features of Excel are stored in the XML spreadsheet format using a similar technique involving various other XML elements.

To get more information about how Excel saves the data in an XML spreadsheet format, please follow the link to the Microsoft XML Spreadsheet Schema given in the "References" section.

I'll illustrate the report-generation process with a simple example of a preformatted library inventory report.

Desired Excel Template
First we create a template for our report. The template helps us to know about the expected formatting requirements (see Figure 1).

Raw Data in XML
As mentioned earlier, the raw data has to be available in an XML document because we want to convert it to another XML document via XSLT. This raw data in XML format is the input to the XSLT process.

If the raw data is in a database or file, it can be converted to XML using various techniques. One of the techniques would be to use Java and DOM, but that is beyond the scope of this discussion. In this case we'll just assume that the data is available to us in the XML format. The data in our example is a short list of books that are available in a fictitious library. Listing 1 shows the data in XML.

Creating the XSL
Creating the XSL is the major task in our process. We have to convert the data in the XML document to another XML document that Excel can read from. This new XML document will have all the data along with the formatting information. We achieve this with XSLT (XSL Transformation).

For this we have to create an XSL document, a onetime process. Once created, the XSL can be used to process any data in the XML document. (Please follow the links in the "References" section for related information on XSLT.) A lot of tools are available to generate the XSL documents. I used XML Spy.

In the XSL document we use XPath to navigate to various parts of the input XML document, that is, to locate a particular element or attribute in the input XML document. Please follow the links in the "References" section for related information on XPath.

Listing 2 gives part (template) of the XSL document. (The entire document can be found in the final code listing.)

Let's take a look at the XSL template. The first element in this XSL is for the title of the report. We associate the ReportTitle attribute of the Report element in the input XML document with this row. This association will display the value of the ReportTitle attribute in that particular cell. Note that we start from the row with Index=3, which gives us two blank rows in the beginning.

Note also the use of the ss:MergeAcross attribute; this is used to represent merging cells. You can learn more about the Microsoft Spreadsheet Schema (XML:SS) by following the links in the "References" section.

Moving along, the next element with Index=5 is for the column headings. After the column headings row, we display the contents of each element from the input XML document. Note the use of different StyleIDs associated with the <Cell> elements. A simple trick will get all these <Style> elements created for you automatically.

The Trick
The Excel 2002 report template with all our desired formatting features can be created and saved in the XML spreadsheet format. It will have <Style> elements (with unique style IDs) for all the unique combinations of formatting requirements. We can now use these <Style> elements in our XSL stylesheet. All we have to do then is to associate a particular <Cell> element in the XSL stylesheet with the appropriate <Style> elements.

This procedure can also be used to learn how several other features of Excel are stored in the XML spreadsheet format and we can use them in our XSL stylesheet.

Getting the Formatted Report (XSL Transformation)
Once the XSL is ready, we have to perform XSL transformation in which the XSL document is used to process the input XML document and create another XML document as output. There are a number of XSL processors available to do this. I used IBM's LotusXSL processor.

Listing 3 is a very simple Java code used to achieve the transformation. It accepts the XML and XSL file names as inputs and generates a new XML document as output. This output file can be saved as an XML document and can be opened with Excel 2002.

We can save the generated XML file on the Web server (with a xls extension) and provide a link to this file for users. Figure 2 is how the report looks when opened in Excel 2002. (Please refer to "Issues and Possible Workarounds" if you want to view the report in other versions of Excel.)

Possible Enhancements
My example illustrates the use of very basic formatting features. Other features of Excel - headers, footers, page numbers, page breaks, formulas, inserting current date, etc. - could be included as well. Using the trick discussed earlier, we can include these features in our XSL document.

We can also have different XSL sheets to present the same data in different formats. For example, one team might not want the "ReceivedDate" column in their report, so what we do is to create a different XSL sheet and that team would get a report in another format. Thus we can give a choice of format on our report-generation menu page.

Issues and Possible Workarounds
If we Web-enabled this report application (so the end user could open it in a browser), then all the users should have Office XP (Excel 2002) installed to view it. Since we can't force them to do this, a possible workaround would be to save the document to an earlier Excel version and then provide a link to the report. This could be done easily with Visual Basic 6.0 (using the Excel 2002 Objects). All we have to do here is execute a simple Save As (old format) command. In this case we'd have to install Office XP only on the server.

References

More Stories By Prasad Joshi

I have been working in the IT industry since the early 2000s. I have worked on a wide variety of technologies and across a wide range of industries. I am unbelievably enthusiastic about Cloud, specifically AWS. As part of Cloud Clicks Consulting, I help businesses develop their cloud migration strategies and also help them with the actual migrations. I also help businesses build new solutions on the cloud.

Comments (1)

Share your thoughts on this story.

Add your comment
You must be signed in to add a comment. Sign-in | Register

In accordance with our Comment Policy, we encourage comments that are on topic, relevant and to-the-point. We will remove comments that include profanity, personal attacks, racial slurs, threats of violence, or other inappropriate material that violates our Terms and Conditions, and will block users who make repeated violations. We ask all readers to expect diversity of opinion and to treat one another with dignity and respect.