A question for the teachers here: how can I extract a table from a document into Excel?

wtujcf123 · posted 2022-4-23 15:39

Teachers, how can I batch-extract the contents of this table from files like this one into Excel?
微信图片_20220423152426.png

zpy2 · posted 2022-4-23 20:56

You can unzip the file (a .docx is a ZIP archive) and then split out the table with regular expressions.
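
A minimal sketch of the unzip step, assuming PHP with the zip extension enabled. In a .docx the body text lives in the archive entry word/document.xml, so it can be read without unpacking to disk. The helper name read_document_xml is hypothetical and only for illustration.

```php
<?php
// Hypothetical helper (not from the thread): returns the raw XML of
// word/document.xml inside a .docx, or null if the archive cannot be
// opened or has no such entry.
function read_document_xml(string $docx_path): ?string
{
    $zip = new ZipArchive();
    if ($zip->open($docx_path) !== true) {
        return null; // not a readable ZIP archive
    }
    $xml = $zip->getFromName('word/document.xml');
    $zip->close();
    return $xml === false ? null : $xml;
}
```

For batch work, a loop such as foreach (glob('*.docx') as $path) could call this helper once per file and pass the result to whatever extraction routine is used.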

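wtujcf123 · posted 2022-4-23 21:49

zpy2 posted 2022-4-23 20:56
You can unzip the file and then split out the table with regular expressions.

Thank you, teacher. I have also been looking for a way to extract the table through the nodes in the Word XML.
It is very well written.
Teacher, did you edit this on your phone? Could you please share the source code so I can study it properly?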

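zpy2 · posted 2022-4-25 10:20

wtujcf123 posted 2022-4-23 21:49
Thank you, teacher. I have also been looking for a way to extract the table through the nodes in the Word XML.
It is very well written.
Teacher, did you edit this on your ...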

<?php
// Extracts table text from the contents of word/document.xml.
// Returns one line per table row, with the row's text runs joined by "~";
// the same rows are also echoed tab-separated.
function extract_tbl_frm_doc($document){
    $out = "";
    //$document=file_get_contents("/storage/emulated/0/Download/求助-如何提出表格内容/求助-如何提出表格内容/document.xml");
    /*
    // Alternative: split on <w:tbl> and process a single table only.
    $pattern="~<w:tbl>~";
    $tbls=preg_split($pattern,$document);
    $table_two=$tbls[3];
    */
    // Match <w:tr> with or without attributes, but not <w:trPr>.
    // The "s" flag lets "." span line breaks in pretty-printed XML.
    $pattern = "~<w:tr(?:\s[^>]*)?>(.*?)</w:tr>~s";

    //preg_match_all($pattern,$table_two,$matches);
    preg_match_all($pattern, $document, $matches);
    $rows = $matches[1];
    foreach ($rows as $row) {
        // One pattern covers both <w:t> and <w:t xml:space="preserve">,
        // so the text runs of a row stay in document order on one line.
        $pattern = "~<w:t(?:\s[^>]*)?>(.*?)</w:t>~s";
        preg_match_all($pattern, $row, $matches);
        echo html_entity_decode(implode("\t", $matches[1])."\r\n");
        $out .= html_entity_decode(implode("~", $matches[1])."\r\n");
    }
    return $out;
}
/*
$document_file="/storage/emulated/0/Download/htdocs/ceshi/extract_doc/docx2/document.xml";
$document=file_get_contents($document_file);
$out=extract_tbl_frm_doc($document);
file_put_contents('输出.txt',$out);
*/
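
The function above joins every text run in a row, so a cell whose text is split across several runs ends up as several fields. As a hedged variation on the same regex idea, the sketch below first splits each row into <w:tc> cells and then concatenates the runs inside each cell, which gives one field per cell that can be written as CSV for Excel. The helper names docx_rows_to_cells and write_rows_as_csv are hypothetical, and the regexes assume no nested tables.

```php
<?php
// Hypothetical helper: returns an array of rows, each an array of cell strings.
// Assumes $document_xml is the content of word/document.xml without nested tables.
function docx_rows_to_cells(string $document_xml): array
{
    $rows = [];
    preg_match_all('~<w:tr(?:\s[^>]*)?>(.*?)</w:tr>~s', $document_xml, $tr);
    foreach ($tr[1] as $row_xml) {
        $cells = [];
        preg_match_all('~<w:tc(?:\s[^>]*)?>(.*?)</w:tc>~s', $row_xml, $tc);
        foreach ($tc[1] as $cell_xml) {
            // Concatenate all text runs that belong to this one cell.
            preg_match_all('~<w:t(?:\s[^>]*)?>(.*?)</w:t>~s', $cell_xml, $t);
            $cells[] = html_entity_decode(
                implode('', $t[1]),
                ENT_QUOTES | ENT_XML1,
                'UTF-8'
            );
        }
        $rows[] = $cells;
    }
    return $rows;
}

// Hypothetical helper: writes the rows as a UTF-8 CSV file.
// The leading byte-order mark helps Excel detect UTF-8 (e.g. Chinese text).
function write_rows_as_csv(array $rows, string $csv_path): void
{
    $fh = fopen($csv_path, 'w');
    fwrite($fh, "\xEF\xBB\xBF");
    foreach ($rows as $row) {
        fputcsv($fh, $row, ',', '"', '');
    }
    fclose($fh);
}
```

Writing CSV rather than a "~"-joined text file is a design choice made here so that Excel can open the result directly; the separator used in the original post works just as well if the file is imported with a custom delimiter.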

wtujcf123 · posted 2022-4-25 17:45

Last edited by wtujcf123 on 2022-4-25 23:18

zpy2 posted 2022-4-25 10:20

Teacher, what language is this? Before you posted the source code, I assumed it was Python.

aman1516 · posted 2022-5-10 19:30

Locating the "cells" of a Word table by row and column coordinates
https://club.excelhome.net/thread-1397725-1-1.html

Offered for reference.
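
To illustrate the idea of addressing a cell by row and column through the Word XML nodes, here is a rough PHP sketch using DOMDocument and DOMXPath. It is not taken from the linked thread. The helper name docx_cell_text is hypothetical, indices are 1-based, only top-level tables under w:body are considered, and merged cells (gridSpan, vMerge) will shift the column positions.

```php
<?php
// Namespace URI bound to the "w" prefix in word/document.xml.
const W_NS = 'http://schemas.openxmlformats.org/wordprocessingml/2006/main';

// Hypothetical helper: text of the cell at 1-based (table, row, column),
// or null when the XML cannot be parsed or no such cell exists.
function docx_cell_text(string $document_xml, int $table, int $row, int $col): ?string
{
    $dom = new DOMDocument();
    if ($document_xml === '' || !@$dom->loadXML($document_xml)) {
        return null;
    }
    $xp = new DOMXPath($dom);
    $xp->registerNamespace('w', W_NS);

    // Top-level tables only: direct children of w:body.
    $query = sprintf(
        '/w:document/w:body/w:tbl[%d]/w:tr[%d]/w:tc[%d]',
        $table, $row, $col
    );
    $cells = $xp->query($query);
    if ($cells === false || $cells->length === 0) {
        return null;
    }

    // Concatenate every text run found inside the selected cell.
    $text = '';
    foreach ($xp->query('.//w:t', $cells->item(0)) as $t) {
        $text .= $t->textContent;
    }
    return $text;
}
```

For example, docx_cell_text($xml, 1, 2, 3) would ask for the third cell in the second row of the first table, returning null if the table is smaller than that.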

wtujcf123 · posted 2022-10-28 20:38

Thank you.
