Excel to MySQL using Java

Target Audience: Java Developers, Database Developers
What should you know already? Basics of Java, JDBC

Excel to MySQL conversion can be done using LOAD DATA INFILE command as I discussed in Excel to MySQL blog. But that command is inadequate when you need to read the excel cell value and making some calculations or concatenations with it, then insert into your relational table. In this situation we need to read excel worksheet using programming languages and do the process before insert. Lets discuss this with Java.

Apache POI API

POI is a project from Apache to create Java API for manipulating various file formats based upon the Office Open XML standards (OOXML) and Microsoft’s OLE 2 Compound Document format (OLE2), simply talking, to read and write MS Excel files using Java. Check the references for POI API documentation.

Reading – Writing Excel

Horrible SpreadSheet Format (HSSF) API is for reading/writting Excel files using pure Java. Check the following list of classes from this API, which are needed to access excel file:

  • org.apache.poi.hssf.usermodel.HSSFCell
  • org.apache.poi.hssf.usermodel.HSSFRow
  • org.apache.poi.hssf.usermodel.HSSFSheet
  • org.apache.poi.hssf.usermodel.HSSFWorkbook
  • org.apache.poi.poifs.filesystem.POIFSFileSystem

Of above, POIFSFileSystem class is used to initiate the underlying operating system’s file structure. It requires an object of java.io.InputStream. An object of type POIFSFileSystem is given to HSSFWorkbook in order to retrieve the workbook. The following code snippet illustrates this:

File excelFile=new File(“location-of-excel-file/filename.xls”);
FileInputStream inStream=new FileInputStream(excelFile);
POIFSFileSystem fileSystem=new POIFSFileSystem(inStream);
HSSFWorkbook workBook = new HSSFWorkbook (fileSystem);

Once you create object for HSSFWorkbook, then you can access a work sheet by providing work sheet index number or its name (say sheet1). Suppose you want access very first work sheet, then the code would be:

/** getting sheet by its index number */
HSSFSheet workSheet=workBook.getSheetAt(0);

(or)

/** getting sheet by its name */
HSSFSheet workSheet=workBook.getSheet(“sheet1”);

The next step is to access rows and cells of each row. A work sheet consists of many rows but not all rows has content. A row has many cells. To simplify the access of rows and cells, we need to use java.util.Iterator with generic type reference HSSFRow and for accessing cells in a row, use another Iterator with generic type reference HSSFCell. Once you access a cell, you can read its content, style, cell number, etc. A cell may contain numerical value or alpha-numerics or booleans. The HSSFCell class has certain methods to check this. Look at the following code snippet:

Iterator rows=workSheet.rowIterator();
Iterator cells=null;
HSSFRow aRow=null;
HSSFCell aCell=null;
while(rows.hasNext()){
    aRow=rows.next();
    cells=aRow.cellIterator();
    while(cells.hasNext()){
      aCell=cells.next();
      switch(aCell.getCellType()){
        case HSSFCell.CELL_TYPE_NUMERIC:
          System.out.println(aCell.getStringCellValue());
          break;
        case HSSFCell.CELL_TYPE_STRING:
          System.out.println(aCell.getNumericCellValue());
          break;
      }
      }
    }
}

Once you could access the cell values like the above code, then there would be no problem for constructing insert query you require for your application. Use JDBC API to insert the calculated values of excel sheet row into your relational table.

References

Permanent link to this article: https://blog.openshell.in/2011/05/excel-to-mysql-using-java/

Leave a Reply

Your email address will not be published.