Java表格处理-两个excel之间的数据拷贝

POI介绍

ApachePOI是用Java编写的免费开源的跨平台的JavaAPI,ApachePOI提供API给Java程序对MicrosoftOffice格式档案读和写的功能,其中使用最多的就是使用POI操作Excel文件。

maven坐标依赖

1
2
3
4
5
6
7
8
9
10
<dependency>
<groupId>org.apache.poi</groupId>
<artifactId>poi</artifactId>
</dependency>

<dependency>
<groupId>org.apache.poi</groupId>
<artifactId>poi-ooxml</artifactId>
</dependency>

POI结构:

1
2
3
4
5
6
7
HSSF-提供读写MicrosoftExcelXLS格式档案的功能
XSSF-提供读写MicrosoftExcelOOXMLXLSX格式档案的功能
HWPF-提供读写MicrosoftWordDOC格式档案的功能
HSLF-提供读写MicrosoftPowerPoint格式档案的功能
HDGF-提供读MicrosoftVisio格式档案的功能
HPBF-提供读MicrosoftPublisher格式档案的功能
HSMF-提供读MicrosoftOutlook格式档案的功能

关于POI中基础API的使用,就不多介绍了,本文主要介绍对excel的处理-两个excel之间的数据拷贝

话不多说,现在开始!!

单元格样式个数限制

首先介绍一下POI中excel的单元格样式,工作簿Workbook中单元格样式个数是有限制的,所以在 程序中应该重复使用相同CellStyle,而不是为每个单元 格创建一个CellStyle

1
2
HSSFCellStyle - 4000个
XSSFCellStyle - 64000个
最大行数 最大列数 单元格样式个数 文本单元格最大长度 一个单元格上条件样式最大个数
HSSF 65536 256(2^8) 4000 32767 3
XSSF 1048576 16384(2^14) 64000 Integer.MAX_VALUE

其实在WorkbookcreateCellStyle()方法中,可以看到各个表格格式实现类对单元格样式的限制

HSSFCellStyle

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
   /**
* The maximum number of cell styles in a .xls workbook.
* The 'official' limit is 4,000, but POI allows a slightly larger number.
* This extra delta takes into account built-in styles that are automatically
*
* See http://office.microsoft.com/en-us/excel-help/excel-specifications-and-limits-HP005199291.aspx
*/
private static final int MAX_STYLES = 4030;

/**
* Create a new Cell style and add it to the workbook's style table.
* You can define up to 4000 unique styles in a .xls workbook.
*
* @return the new Cell Style object
* @throws IllegalStateException if the number of cell styles exceeded the limit for this type of Workbook.
*/
@Override
public HSSFCellStyle createCellStyle()
{
if(workbook.getNumExFormats() == MAX_STYLES) {
throw new IllegalStateException("The maximum number of cell styles was exceeded. " +
"You can define up to 4000 styles in a .xls workbook");
}
ExtendedFormatRecord xfr = workbook.createCellXF();
short index = (short) (getNumCellStyles() - 1);
return new HSSFCellStyle(index, xfr, this);
}

HSSFCellStyle

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
    /**
* Excel2007
*
* <ul>
* <li>The total number of available rows is 1M (2^20)</li>
* <li>The total number of available columns is 16K (2^14)</li>
* <li>The maximum number of arguments to a function is 255</li>
* <li>Number of conditional format conditions on a cell is unlimited
* (actually limited by available memory in Excel)</li>
* <li>Number of cell styles is 64000</li>
* <li>Length of text cell contents is 32767</li>
* <ul>
*/
EXCEL2007(0x100000, 0x4000, 255, Integer.MAX_VALUE, 64000, 32767);

// Is this right? Number formats (XSSFDataFormat) and cell styles (XSSFCellStyle) are different. What's up with the plus 1?
private static final int MAXIMUM_STYLE_ID = SpreadsheetVersion.EXCEL2007.getMaxCellStyles();

public XSSFCellStyle createCellStyle() {
if (this.getNumCellStyles() > MAXIMUM_STYLE_ID) {
throw new IllegalStateException("The maximum number of Cell Styles was exceeded. You can define up to " + MAXIMUM_STYLE_ID + " style in a .xlsx Workbook");
} else {
int xfSize = this.styleXfs.size();
CTXf xf = org.openxmlformats.schemas.spreadsheetml.x2006.main.CTXf.Factory.newInstance();
xf.setNumFmtId(0L);
xf.setFontId(0L);
xf.setFillId(0L);
xf.setBorderId(0L);
xf.setXfId(0L);
int indexXf = this.putCellXf(xf);
return new XSSFCellStyle(indexXf - 1, xfSize - 1, this, this.theme);
}
}

所以,如果在拷贝两个Workbook的数据时,数据单元格数量过多,频繁使用createCellStyle()方法创建单元格样式,会导致单元格样式超出限制。

所以,为了解决单元格样式限制问题,可以将样式缓存起来,具体方法可见如下代码

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
package com.huoli.ctar.cms.poi;

import lombok.Data;
import org.apache.poi.hssf.usermodel.HSSFCellStyle;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.ss.usermodel.*;
import org.apache.poi.ss.util.CellRangeAddress;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

import java.text.SimpleDateFormat;
import java.util.*;

/**
* Author: caijc
* Date: 2022/1/5 10:28
* Description:
*/
@Data
public class POICopyHandler {

// 缓存单元格的样式索引,key是拷贝原表格的样式索引id,value是拷贝目标表格的样式id
Map<Short,Short> styleMap = new HashMap<>();
private Workbook fromWorkbook;
private Workbook toWorkbook;
private SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd");

// 构造方法,将原表格的所有样式一并拷贝到目标表格,并将表格样式索引缓存
public POICopyHandler(Workbook fromWorkbook, Workbook toWorkbook) {
this.fromWorkbook = fromWorkbook;
this.toWorkbook = toWorkbook;
Sheet fromSheet = fromWorkbook.getSheetAt(0);
Sheet toSheet = fromWorkbook.getSheetAt(0);
if (toSheet == null){
toWorkbook.createSheet();
}
for (int i = fromSheet.getFirstRowNum(); i <= fromSheet.getLastRowNum(); i++) {
Row fromRow = fromSheet.getRow(i);
for (int j = fromRow.getFirstCellNum(); j < fromRow.getPhysicalNumberOfCells(); j++) {
Cell fromCell = fromRow.getCell(j);
CellStyle fromStyle = fromCell.getCellStyle();
if(!styleMap.containsKey(fromStyle.getIndex())){
CellStyle toStyle = toWorkbook.createCellStyle();
toStyle.cloneStyleFrom(fromStyle);
styleMap.put(fromStyle.getIndex(), toStyle.getIndex());
}
}
}

}

/**
* 复制行
* 此方法主要用于复制2个不同Workbook间的行
*/
public void copySingleRows(int fromSheetIndex, int toSheetIndex, int fromRowIndex, int toRowIndex) {
Sheet fromSheet = fromWorkbook.getSheetAt(fromSheetIndex);
Sheet toSheet = toWorkbook.getSheetAt(toSheetIndex);

if ((fromRowIndex == -1) || (fromRowIndex == -1)) {
return;
}

// 设置列宽
Row fromRow = fromSheet.getRow(fromRowIndex);
if (fromRow == null) {
return;
}
for (int i = fromRow.getLastCellNum(); i >= fromRow.getFirstCellNum(); i--) {
toSheet.setColumnWidth(i, fromSheet.getColumnWidth(i));
toSheet.setColumnHidden(i, false);
}
// 拷贝行并填充数据
Row toRow = toSheet.createRow(toRowIndex);
toRow.setHeight(fromRow.getHeight());
for (int j = fromRow.getFirstCellNum(); j <= fromRow.getPhysicalNumberOfCells(); j++) {
Cell fromCell = fromRow.getCell(j);
Cell toCell = toRow.createCell(j);
if (fromCell != null) {
CellStyle fromStyle = fromCell.getCellStyle();
CellStyle toStyle = toWorkbook.getCellStyleAt(styleMap.get(fromStyle.getIndex()));
if (toStyle != null){
toCell.setCellStyle(toStyle);
}
switch (fromCell.getCellTypeEnum()) {
case BOOLEAN:
toCell.setCellValue(fromCell.getBooleanCellValue());
break;
case ERROR:
toCell.setCellErrorValue(fromCell.getErrorCellValue());
break;
case FORMULA:
toCell.setCellFormula(parseFormula(fromCell.getCellFormula()));
break;
case NUMERIC:
if(DateUtil.isCellDateFormatted(fromCell)){
toCell.setCellValue(sdf.format(fromCell.getDateCellValue()));
} else{
toCell.setCellValue(fromCell.getNumericCellValue());
}
break;
case STRING:
toCell.setCellValue(fromCell.getRichStringCellValue());
break;
}
}

}
}

private String parseFormula(String pPOIFormula) {
final String cstReplaceString = "ATTR(semiVolatile)"; //$NON-NLS-1$
StringBuffer result;
int index;

result = new StringBuffer();
index = pPOIFormula.indexOf(cstReplaceString);
if (index >= 0) {
result.append(pPOIFormula, 0, index);
result.append(pPOIFormula.substring(index + cstReplaceString.length()));
} else {
result.append(pPOIFormula);
}

return result.toString();
}
}

调用示例:

1
2
POICopyHandler poiCopyHandler = new POICopyHandler(fromWorkbook, toWorkbook);
poiCopyHandler.copySingleRows(0,0,0,0);// 四个参数分别是:源Workbook的sheet索引,目标Workbook的sheet索引,源Workbook的行索引,目标Workbook的行索引