How to read excel 2003 (.xls) file using Apache POI?

Before you start with how to read excel 2003 file using apache poi, suggested pre-requisites to read –

Introduction to Apache POI

How to set up Apache POI in java – selenium project.

To start with the article, about how to read an excel file using Apache POI, Pre-rwquisite is, you have configured Apache POI in your project. If not, refer above link.
As we saw earlier, HSSF is the POI Project’s pure Java implementation of the Excel ’97(-2007) file format. Whereas XSSF is the POI Project’s pure Java implementation of the Excel 2007 OOXML (.xlsx) file format.
Let us understand about excel first.
First find out the excel file, known as Workbook here. Then within a workbook, we have multiple sheets. Go to the desired sheet using sheet name / sheet index. Then go to row & column position & get the cell data.
Below are the steps to Read excel sheet

     1.     Create object of the File, provide file path where your excel sheet is placed. Make sure that Excel file is in 2003 format (.xls format).

           2.       Get the file in FileInputStream which we will use for reading excel sheet.
           3.       Now use HSSFWorkbook class to get the desired workbook from FileInputStream.
              HSSFWorkbook wb = new HSSFWorkbook(ExcelFileToRead);
 
          4.       Get the required sheet using class HSSFSheet (of the above workbook).
               HSSFSheet sheet=wb.getSheetAt(0);
         5.       Now get the data from sheet using object created by HSSFSheet.
Look at the below example.
There are two ways
            1.       If the excel sheet has only String data (or one type of data)
            2.       If the excel sheet has combination of data like String /int/Boolean etc.
Consider the First point when Excel sheet has only one type of data. Below ScreenShot shows the input excel file.
Observe the below piece of code, which will be used to read excel file data (have only one type of data – String here)
package apachePOI;

import java.io.File;
import java.io.FileInputStream;

import org.apache.poi.hssf.usermodel.HSSFRichTextString;
import org.apache.poi.hssf.usermodel.HSSFSheet;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;

public class Read2003FormatExcel {

 public static void main(String[] args) throws Exception {

  //location of your file - filepath
  String filepath = "C:UsersPrakashDesktopTestData.xls";
  
  //Create File object with above file path
  File srcFile = new File(filepath);
  
  //Create FileInputStream for above srcFile - to read file
  FileInputStream fis = new FileInputStream(srcFile);
  
  //Excel file is nothing but workbook, so create workbook object.
  HSSFWorkbook wb = new HSSFWorkbook(fis);
  
  //Go to the desired sheet, here index 0 means first tab
  HSSFSheet sheet = wb.getSheetAt(0);
  
  //get total number of rows & columns so that we can iterate over it.
  int TotalNumOfRows = sheet.getLastRowNum()+1;
  int TotalNumOfCols = sheet.getRow(0).getLastCellNum();
  
  //Iterate over the sheet, using total number of rows and columns
  for (int i = 0; i < TotalNumOfRows; i++ ){
   for (int j = 0; j < TotalNumOfCols; j++){
    
    //get the cell data using row number and column number of the cell.
    HSSFRichTextString cellData = sheet.getRow(i).getCell(j).getRichStringCellValue();   
    System.out.println(cellData);
   }
  }
  
 }

}
Now, in Another case, when Excel sheet contain all type of data, example – String / int / Boolean etc.

Observe below piece of code – to read data from excel sheet containing all type of data (can be used even if excel sheet has one type of data)
package apachePOI;

import java.io.File;
import java.io.FileInputStream;
import java.util.Iterator;

import org.apache.poi.hssf.usermodel.HSSFRow;
import org.apache.poi.hssf.usermodel.HSSFSheet;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.Row;

public class Read2003FormatExcelUsingIterator {

 public static void main(String[] args) throws Exception {

  //location of your file - filepath
  String filepath = "C:UsersPrakashDesktopTestData.xls";
  
  //Create File object with above file path
  File srcFile = new File(filepath);
  
  //Create FileInputStream for above srcFile - to read file
  FileInputStream fis = new FileInputStream(srcFile);
  
  //Excel file is nothing but workbook, so create workbook object.
  HSSFWorkbook wb = new HSSFWorkbook(fis);
  
  //Go to the desired sheet, here index 0 means first tab
  HSSFSheet sheet = wb.getSheetAt(0);
  
  //get total number of rows & columns so that we can iterate over it.
  int TotalNumOfRows = sheet.getLastRowNum()+1;
  int TotalNumOfCols = sheet.getRow(0).getLastCellNum();
  
  Iterator<Row> itr = sheet.rowIterator();
  while (itr.hasNext()){
   Row row = itr.next();
   
   Iterator<Cell> itrCell = row.cellIterator();
   
   while (itrCell.hasNext()){
    Cell cell =  itrCell.next();
    switch (cell.getCellType()){
    case Cell.CELL_TYPE_NUMERIC:
     System.out.println("cell type is Numeric: "+cell.getNumericCellValue());
     break;
    case Cell.CELL_TYPE_STRING:
     System.out.println("cell type is String: "+cell.getStringCellValue());
     break;
    
    }
   }
  }
  
 }

}
Hope This Helps !!!! do post your questions if any !!!

Suggested next article to read –     Writing Excel files using Apache POI

How to setup Apache POI in java – selenium Project.

In this article i will talk about how to setup apache POI in Selenium java project under eclipse.

Suggested post to read –  Apache POI Introduction

Setting up Apache POI in a java – selenium project is very easy. Again as like selenium, Apache POI is a suite of jar files. So, setting up Apache POI means adding these jar files to a project. Let us see how this can be achieved.
Step1: Download Apache POI.  
   You can download Apache POI from its official website (https://poi.apache.org/download.html). Latest version of Apache POI is 3.16 (a stable release) as of today.
To Download Apache POI – Click here. Version 3.16
Step2: Extract the downloaded POI jar files.
Once zip file is downloaded from above link, extract the jar files to some location.
Step 3: Create new Java project.
To know about how to create new project, refer step 2 of below post.

How to create java project in eclipse IDE

Step4: Add extracted jar files in to java – selenium project.
Follow below steps to know how can we add the jar files in to a project.
First, right click on project created in step 3 here.
Then Go to “Build path” -> “Configure Build Path”
 Then Click on “Add External Jar” option.
Then select all extracted jar files and click on apply and save.
That’s it, you have configured Apache POI in your java – selenium project. You can start using excel files in java -selenium project.
Follow next articles to know how to used Excel files in java – selenium project.
Refer –

How to read an excel 2003 (.xls) file using Apache POI

How to read an Excel 2007 (.xlsx) file using Apache POI

Writing Excel files using Apache POI

Apache POI Introduction

What is Apache POI?

Apache POI is the java API for Microsoft documents like excel, word, PowerPoint, outlook, Visio, Publisher etc.
Some time, while automating web application, it is expected to read the input data from excel file, Sometime it is expected to generate reports in excel / word etc. So, in order to deal with this, one needs Apache POI. – which is a java API for Microsoft documents.
It is used to create new Microsoft document (like excel / word etc), to modify them, to access data from excel file / word file by using java programs.

POI have various components for various Microsoft documents. Below are the components of POI.
Components of Apache POI. – 
Excel (SS=HSSF+XSSF+SXSSF) –
HSSF is the POI Project’s pure Java implementation of the Excel ’97(-2007) file format. XSSF is the POI Project’s pure Java implementation of the Excel 2007 OOXML (.xlsx) file format.
PowerPoint (SL=HSLF+XSLF) –
HSLF is the POI Project’s pure Java implementation of the Powerpoint ’97(-2007) file format.
Word (WP=HWPF+XWPF) –
HWPF is the name of our port of the Microsoft Word 97(-2007) file format to pure Java. It also provides limited read only support for the older Word 6 and Word 95 file formats.
The partner to HWPF for the new Word 2007 .docx format is XWPF. Whilst HWPF and XWPF provide similar features, there is not a common interface across the two of them at this time.
Outlook (HSMF) –
HSMF is the POI Project’s pure Java implementation of the Outlook MSG format.
Visio (HDGF+XDGF) –
HDGF is the POI Project’s pure Java implementation of the Visio binary (VSD) file format. XDGF is the POI Project’s pure Java implementation of the Visio XML (VSDX) file format.
Publisher (HPBF) –
HPBF is the POI Project’s pure Java implementation of the Publisher file format.
Official WebSite of Apache POI is: https://poi.apache.org/
It is very common to use Excel file for Automation purpose. Here is how to get started with using excel file for automation purpose / data driven testing.
1. Download the latest release of the library here: Apache POI – Download Release Artifacts
2. Extract the zip file and add the appropriate JAR files to your project’s classpath:
– If you are reading and writing only Excel 2003 format, only the file poi-VERSION.jar is enough.
– If you are reading and writing Excel 2007 format, you have to include the following files:
  • poi-ooxml-VERSION.jar
  • poi-ooxml-schemas-VERSION.jar
  • xmlbeans-VERSION.jar
For Automation testing using Selenium WebDriver, we do not need all these things. What we need mostly is, how to read / write excel sheet (2003 / 2007 format). We may not be concerned about other uses of Apache POI.
So, let us study apache POI with below articles.

      1.       How to setup Apache POI in java – selenium Project.

           2.       How to read an excel 2003 (.xls) file using Apache POI

           3.       How to read an Excel 2007 (.xlsx) file using Apache POI

           4.       Writing Excel files using Apache POI

Hope this helps !!!!