2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018 2020

04/26/2004: POI Optimization - eliminating trailing empty rows from an HSSFSheet.

While my spreadsheet has only 7 rows with data, POI creates over 65,000 rows in the HSSFSheet object. This leads to a large amount of essentially unused memory. In order to free that memory, the following code snippet from my version of the public HSSFWorkbook(POIFSFileSystem fs, boolean preserveNodes) method works from bottom towards the top of the sheet removing empty rows. The code in bold performs the optimization.

            HSSFSheet hsheet = new HSSFSheet(workbook, sheet);

            boolean stop = false;
            boolean nonBlankRowFound;
            short c;
            HSSFRow lastRow = null;
            HSSFCell cell = null;

            while (stop == false) {
                nonBlankRowFound = false;
                lastRow = hsheet.getRow(hsheet.getLastRowNum());
                for (c = lastRow.getFirstCellNum(); c <= lastRow.getLastCellNum(); c++) {
                    cell = lastRow.getCell(c);
                    if (cell != null && lastRow.getCell(c).getCellType() != HSSFCell.CELL_TYPE_BLANK) {
                        nonBlankRowFound = true;
                    }
                }
                if (nonBlankRowFound == true) {
                    stop = true;
                } else {
                    hsheet.removeRow(lastRow);
                }
            }

            sheets.add(hsheet);

04/22/2004: POI Optimization - Speeding up org.apache.poi.hssf.usermodel.HSSFSheet.setPropertiesFromSheet()

This setPropertiesFromSheet() method potentially logs a lot of information. The logging calls are expensive because of the objects created when assembling the log messages. Using the check() method of the logging sub-system can prevent object creation when debugging is turned off. This change reduced execution time of my test case by 14.2 seconds and prevented the creation of 781,632 objects.

    /**
     * used internally to set the properties given a Sheet object
     */
    private void setPropertiesFromSheet(Sheet sheet) {
        long timestart = 0; // only used for debugging.
        int sloc = sheet.getLoc();
        RowRecord row = sheet.getNextRow();

        while (row != null) {
            createRowFromRecord(row);
            row = sheet.getNextRow();
        }
        sheet.setLoc(sloc);
        CellValueRecordInterface cval = sheet.getNextValueRecord();

        if (log.check(DEBUG)) {
            timestart = System.currentTimeMillis();
            log.log(DEBUG, "Time at start of cell creating in HSSF sheet = ", new Long(timestart));
        }
        HSSFRow lastrow = null;

        while (cval != null) {
            long cellstart = System.currentTimeMillis();
            HSSFRow hrow = lastrow;

            if ((lastrow == null) || (lastrow.getRowNum() != cval.getRow())) {
                hrow = getRow(cval.getRow());
            }
            if (hrow != null) {
                lastrow = hrow;
                if (log.check(DEBUG)) {
                    log.log(DEBUG, "record id = " + Integer.toHexString(((Record) cval).getSid()));
                }
                hrow.createCellFromRecord(cval);
                cval = sheet.getNextValueRecord();
                if (log.check(DEBUG)) {
                    log.log(DEBUG, "record took ", new Long(System.currentTimeMillis() - cellstart));
                }
            } else {
                cval = null;
            }
        }
        if (log.check(DEBUG)) {
            log.log(DEBUG, "total sheet cell creation took ", new Long(System.currentTimeMillis() - timestart));
        }
    }