Apache POI Introduction

What is Apache POI?

Apache POI is a Java API for working with Microsoft document formats such as Excel, Word, PowerPoint, Outlook, Visio, and Publisher.
When automating a web application, you often need to read input data from an Excel file, or to generate reports in Excel or Word. Apache POI lets a Java program do both.
With it, Java programs can create new Microsoft documents (Excel, Word, and so on), modify existing ones, and read data from them.

POI has a separate component for each family of Microsoft documents.
Components of Apache POI:
Excel (SS=HSSF+XSSF+SXSSF) –
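HSSF is the POI Project’s pure Java implementation of the Excel ’97(-2007) (.xls) file format. XSSF is the POI Project’s pure Java implementation of the Excel 2007 OOXML (.xlsx) file format. SXSSF is a streaming extension of XSSF for writing very large spreadsheets when memory is limited.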
PowerPoint (SL=HSLF+XSLF) –
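HSLF is the POI Project’s pure Java implementation of the PowerPoint ’97(-2007) (.ppt) file format. XSLF is the POI Project’s pure Java implementation of the PowerPoint 2007 OOXML (.pptx) file format.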
Word (WP=HWPF+XWPF) –
HWPF is the POI Project’s port of the Microsoft Word 97(-2007) (.doc) file format to pure Java. It also provides limited read-only support for the older Word 6 and Word 95 file formats.
The partner to HWPF for the newer Word 2007 .docx format is XWPF. Although HWPF and XWPF provide similar features, the two do not currently share a common interface.
Outlook (HSMF) –
HSMF is the POI Project’s pure Java implementation of the Outlook MSG format.
Visio (HDGF+XDGF) –
HDGF is the POI Project’s pure Java implementation of the Visio binary (VSD) file format. XDGF is the POI Project’s pure Java implementation of the Visio XML (VSDX) file format.
Publisher (HPBF) –
HPBF is the POI Project’s pure Java implementation of the Publisher file format.
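To make the component list above easier to remember, here is a minimal sketch that maps a file extension to the POI component that handles it. The class PoiComponentLookup and its method componentFor are hypothetical helpers written for this article; they are not part of Apache POI. The sketch uses only the standard Java library and assumes Java 9 or later.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

/** Hypothetical helper (not part of Apache POI): file extension -> POI component name. */
final class PoiComponentLookup {

    // Mapping taken from the component list above.
    private static final Map<String, String> COMPONENT_BY_EXTENSION = Map.of(
            "xls", "HSSF",
            "xlsx", "XSSF",
            "ppt", "HSLF",
            "pptx", "XSLF",
            "doc", "HWPF",
            "docx", "XWPF",
            "msg", "HSMF",
            "vsd", "HDGF",
            "vsdx", "XDGF",
            "pub", "HPBF");

    private PoiComponentLookup() {
    }

    /** Returns the component name for the file, or empty if the extension is unknown or missing. */
    static Optional<String> componentFor(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return Optional.empty();
        }
        // Compare case-insensitively so "REPORT.DOC" and "report.doc" behave the same.
        String extension = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return Optional.ofNullable(COMPONENT_BY_EXTENSION.get(extension));
    }

    public static void main(String[] args) {
        System.out.println(componentFor("TestData.xlsx").orElse("unsupported"));
        System.out.println(componentFor("notes.txt").orElse("unsupported"));
    }
}
```

For test automation, the two entries that matter most are xls (HSSF) and xlsx (XSSF).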
The official website of Apache POI is https://poi.apache.org/
Excel files are very commonly used in test automation, especially for data-driven testing. Here is how to get started:
1. Download the latest release of the library from the "Apache POI – Download Release Artifacts" page on the official website.
2. Extract the zip file and add the appropriate JAR files to your project’s classpath:
– If you only read and write the Excel 2003 (.xls) format, poi-VERSION.jar alone is enough.
– If you read and write the Excel 2007 (.xlsx) format, you also have to include the following files in addition to poi-VERSION.jar:
  • poi-ooxml-VERSION.jar
  • poi-ooxml-schemas-VERSION.jar (named poi-ooxml-lite-VERSION.jar in POI 5.x releases)
  • xmlbeans-VERSION.jar
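Once the JARs are added, a quick way to confirm the setup is to check whether the main workbook classes can be found on the classpath. The sketch below is only an illustration: the class PoiClasspathCheck and its method isOnClasspath are hypothetical names, not part of Apache POI. The two class names it looks up, HSSFWorkbook (for .xls) and XSSFWorkbook (for .xlsx), are real POI classes. The code itself uses only the standard Java library, so it compiles and runs even when POI is missing.

```java
import java.util.List;

/** Hypothetical setup check (not part of Apache POI). */
final class PoiClasspathCheck {

    // HSSFWorkbook ships in the poi JAR; XSSFWorkbook ships in the poi-ooxml JAR.
    static final String XLS_CLASS = "org.apache.poi.hssf.usermodel.HSSFWorkbook";
    static final String XLSX_CLASS = "org.apache.poi.xssf.usermodel.XSSFWorkbook";

    private PoiClasspathCheck() {
    }

    /** Returns true if the named class can be located, without initializing it. */
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, PoiClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Either the class itself or one of its dependencies is not available.
            return false;
        }
    }

    public static void main(String[] args) {
        for (String name : List.of(XLS_CLASS, XLSX_CLASS)) {
            System.out.println(name + (isOnClasspath(name) ? " found" : " MISSING"));
        }
    }
}
```

If the XSSFWorkbook line reports MISSING while HSSFWorkbook is found, the poi-ooxml JAR is most likely not on the classpath. Note that this check only locates the classes; it does not prove that every transitive dependency is present.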
For automation testing with Selenium WebDriver, we do not need everything POI offers. What we mostly need is to read and write Excel sheets (2003 / 2007 format); the other uses of Apache POI are of little concern here.
So, let us study Apache POI through the articles below.

1. How to set up Apache POI in a Java Selenium project
2. How to read an Excel 2003 (.xls) file using Apache POI
3. How to read an Excel 2007 (.xlsx) file using Apache POI
4. Writing Excel files using Apache POI

Hope this helps!


Want to learn framework development from scratch? Detailed framework development videos are available on YouTube:

https://www.youtube.com/automationtalks

Get the framework code from the GitHub repositories: https://github.com/prakashnarkhede?tab=repositories