ExcelHome Technical Forum


[Help] Splitting a text file


Posted 2007-6-22 12:49

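I queried gene sequences on a website, and the results were saved as one combined file.
To make analysis and lookup easier, the combined file needs to be split into smaller files.
The file-splitting tools I tried cannot split it the way I need,
so the only option is to write a program myself...
Embarrassingly, I don't know how, so I'm asking everyone here for help.
The actual sequences are in the attachment.
Each record should end up in its own file, for example: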
>gi|1532267|gb|U68521.1|HIVU68521 HIV-1 sample 9939 patient 10 from Sweden matrix protein (gag) gene, p17 region, partial cds
ATGGGTGCGAGAGCGTCAGTATTAAGCGGGGGAGAGTTAGATAAATGGGAAAGAATTCGGTTAAGGCCAG
GGGGAAAGAAAAAATATAAATTAAAACATATAGTATGGGCAAGCAGGGAGCTAGAACGATTCGCAGTTAA
TCCTGGCCTTTTAGAAACATCAGAAGGCTGTAGACAAATACTGGGACAGCTACAACCAGCCCTTCAGACA
GGATCAGAAGAACTTAAATCATTACATAATACAGTAGCAGTCCTCTATTGTGTGCATCAAAGGATAGATG
TAAAAGACACCAAGGAAGCTTTAGAGAAAATAGAGGAAGAGCAAAACAAAAGTAAGAAAAAAGCACAGCA

Oops, the upload failed.
Sequences:
>gi|1532267|gb|U68521.1|HIVU68521 HIV-1 sample 9939 patient 10 from Sweden matrix protein (gag) gene, p17 region, partial cds
ATGGGTGCGAGAGCGTCAGTATTAAGCGGGGGAGAGTTAGATAAATGGGAAAGAATTCGGTTAAGGCCAG
GGGGAAAGAAAAAATATAAATTAAAACATATAGTATGGGCAAGCAGGGAGCTAGAACGATTCGCAGTTAA
TCCTGGCCTTTTAGAAACATCAGAAGGCTGTAGACAAATACTGGGACAGCTACAACCAGCCCTTCAGACA
GGATCAGAAGAACTTAAATCATTACATAATACAGTAGCAGTCCTCTATTGTGTGCATCAAAGGATAGATG
TAAAAGACACCAAGGAAGCTTTAGAGAAAATAGAGGAAGAGCAAAACAAAAGTAAGAAAAAAGCACAGCA
AGCAGCAGCTGACACAGGAAACAACAGCCAGGTCAGCCAAAATTACCCTATAGTGCAGAACCTTCAGGGG
CAAATGGTA

>gi|1532263|gb|U68519.1|HIVU68519 HIV-1 sample 256 patient 9 from Sweden matrix protein (gag) gene, p17 region, partial cds
ATGGGTGCGAGAGCGTCAGTATTAAGCGGGGGAGAATTAGATAAATGGGAAAAAATTCGGTTAAGGCCAG
GGGGAAAGAAAAAATATAAATTAAAACATATAGTATGGGCAAGCAGGGAGCTAGAACGATTCGCAGTTAA
TCCTAGCCTTTTAGAGACATCAGAAGGCTGTAGACAAATACTGGGACAGCTACAACCGGCCCTTCAGACA
GGATCAGAAGAACTTAAATCATTACATAATACAGTAGCAGTCCTCTATTGTGTGCATCAAAGGATAGATG
TAAAAGACACCAAGGAAGCTTTAGAGAAGATAGAGGAAGAGCAAAACAAAAGCAAGAAAAAGGCACAGCA
AGCAGCAGCTGACACAGGAAACAACAGCCAGGTCAGCCAAAATTACCCTATAGTGCAGAACCTCCAGGGG
CAAATGGTA

>gi|1532259|gb|U68517.1|HIVU68517 HIV-1 sample 159 patient 8 from Sweden matrix protein (gag) gene, p17 region, partial cds
ATGGGTGCGAGAGCGTCAGTATTAAGCGGGGGAGAATTAGATAGATGGGAAAAAATTCGGTTAAGGCCAG
GGGGAAAGAAAAAATATAAATTAAAACATATAGTATGGGCAAGAAGGGAGCTAGAACGATTCGMAGTTAA
TCCTGGCCTTTTAGAGACATCAGAAGGTTGTAGACAAATACTGGGACAGCTACAGCCATCCCTTCAGACA
GGATCAGAAGAACTTAAATCATTACATAATACAGTAGCAGTCCTCTATTGTGTGCATCAAAGGATAGATG
TAAAAGACACCAAGGAAGCTTTAGACAAAATAGAGGAAGAGCAAAACAAAAGTAAGAAAAAAGCACAGCA
AGCAGCAGCTGACACAGGAAACAACAGCCAGGTCAGTCAAAATTACCCTATAGTGCAGAACCTTCAGGGG
CAAATGGTA

>gi|1532257|gb|U68516.1|HIVU68516 HIV-1 sample 6760 patient 7 from Sweden matrix protein (gag) gene, p17 region, partial cds
ATGGGTGCGAGAGCGTCAGTATTAAGCGGGGGAGAATTAGATAAATGGGAAAAAATTCGGTTAAGGCCAG
GGGGAAGGAAAAAGTATAAATTAAAACATCTAGTATGGGCAAGCAGGGAGCTAGAACGATATGCAGTTAA
TCCTGGCCTTTTAGAGACATCAGAAGGCTGTAGACAAATATTAGGACAGCTACAACCAGCCATTCAGACA
GGATCAGAAGAACTTAAATCATTATATAATACAGTAGTAACCCTCTACTGTGTGCATCAAAGGATAGATG
TAAAAGACACCAAGGAAGCTTTAGACAAGGTAGAGGAAGAACAAAACAAAAGTAAGAAAAAAGCACAGCA
AGCAGCAGCTGACACAGGAAACAGCGGCAAGGTCAGCCAAAATTTCCCTATAGTGCAGAACCTACAGGGG
CAAATGGTA

>gi|1532255|gb|U68515.1|HIVU68515 HIV-1 sample 6767 patient 6 from Sweden matrix protein (gag) gene, p17 region, partial cds
ATGGGTGCGAGAGCGTCAGTATTAAGCGGGGGAGAATTAGATAAATGGGAAAAAATTCGGTTACGGCCAG
GGGGAAAGAAAAAATATCAATTAAAACATATAGTATGGGCAAGCAGGGAGCTAGAACGATTCGCAGTTAA
TCCTGGCCTTTTAGAGACATCAGAAGGCTGTAGACAAATATTGGGACAGTTACAACCATCCCTTCAGACA
GGATCAGAAGAACTTAAATCATTATATAATACAGTAGCAACCCTCTATTGTGTGCATCAAAAGATAGATA
TAAAAGACACCAAGGAAGCTTTAGAGAAGATAGAGGAAGAGCAAAACAAATGTAAGAAAAAGGCACAGCA
AGCCGCTGCTAACACAGGAAGCAGCAGCCAGGTCAGCCAAAATTACCCTATAGTGCAGAACCTCCAGGGG
CAAATGGTA

>gi|1532253|gb|U68514.1|HIVU68514 HIV-1 sample 317 patient 5 from Sweden matrix protein (gag) gene, p17 region, partial cds
ATGGGTGCGAGAGCGTCAGTACTAAGCGGGGGAGAATTAGATAAATGGGAAAAAATTCGGTTAAGGCCAG
GGGGAAAGAAAAAATATCAATTAAAACATCTAGTATGGGCAAGCAGGGAGCTAGAACGATTCTCAATTAA
TCCTGGTCTTTTAGAGACATCAGAAGGCTGTAGACAAATATTGAGACAGCTACAACCATCCCTTCAGACA
GGATCAGAAGAACTTAAATCATTATATAATACAGTAGCAGTCCTCTATTGTGTGCATCAAAGGATAGATG
TAAAAGACACCAAGGAAGCTTTAGAAAAGATAGAGGAAGAGCAAAACAAAAGTAAGAAAAAGGCACAGCA
AGCAGCAGCTGACACAGGAAACAGCAGCCAGGTCAGCCAAAATTACCCTATAGTGCAGAACCTCCAGGGG
CAAATGGTA

>gi|1532249|gb|U68512.1|HIVU68512 HIV-1 sample 105 patient 3 from Sweden matrix protein (gag) gene, p17 region, partial cds
ATGGGTGCGAGAGCGTCAGTATTAAGCGGGGGAGAATTAGATAAATGGGAAAAAATTCGGTTAAGGCCAG
GGGGAAAGAAAAAATATAAATTAAAACATATAGTATGGGCAAGCAGGGAGCTAGAACGATTCGCAGTTAA
TCCTGGCCTTTTAGAGACATCAGAAGGCTGTAGACAAATATTGGGACAGCTACAACCAGCCCTTCAGACA
GGATCAGAAGAACTTAGATCATTATATAATACAGTAGCAACCCTCTATTGTGTACATCAAAGGATAGATG
TAAAAGACACCAAGGAAGCTTTAGAGAAGATAGAGGAAGAGCAAAACAAAAGTAAGAAAAAAGCACAGCA
AGCAGCAGCTGACACAGGAAGCAGCAGCCAGGTCAGCCAAAATTACCCTATAGTGCAGAACTTACAGGGG
CAAATGGTA

>gi|1532247|gb|U68511.1|HIVU68511 HIV-1 sample 135 patient 2 from Sweden matrix protein (gag) gene, p17 region, partial cds
ATGGGTGCGAGAGCGTCAGTATTAAGCGGGGGAGAATTAGATAAATGGGAAAAAATTCGGTTAAGGCCAG
GGGGAAAGAAAAAATATAAATTAAAACATATAGTATGGGCAAGCAGGGAGCTAGAACGATTCGCAGTTAA
TCCTGGCCTTTTAGAGACATCAGAAGGCTGTAGACAAATATTGGGACAGCTACAACCAGCCCTTCAGACA
GGATCAGAAGAACTTAAATCATTATATAATACAGTAGCAACCCTCTATTGTGTACATCAAAGGATAGATG
TAAAAGACACCAAGGAAGCTTTAGAGAAGATAGAGGAAGAGCAAAACAAAAGTAAGAAAAAGGCACAGCA
AGCAGCAGCTGACACAGGAAGCAGCAGCCAGGTCAACCAAAATTACCCTATAGTGCAGAACTTACAGGGG
CAAATGGTA

>gi|1532243|gb|U68509.1|HIVU68509 HIV-1 sample 136 patient 1 from Sweden matrix protein (gag) gene, p17 region, partial cds
ATGGGTGCGAGAGCGTCRGTATTAAGCGGGGGAGAATTAGATAAATGGGAAAAAATTCGGTTAAGGCCAG
GGGGAAAGAAAAAATATAAATTAAAACATATAGTATGGGCAAGCAGGGAGCTAGAACGATTCGCAGTTAA
TCCTGGCCTTTTAGAGACATCAGAAGGCTGTAGACAAATATTGGGACAGCTACAACCATCCCTTCAGACA
GGATCAGAAGAACTTAGATCATTATATAATACAGTAGCAGTCCTCTATTGTGTGCATCAAAGGATAGATG
TAAAAGACACCAAGGAAGCTTTAGAGAAGATAGAGGAAGAGCAAAACAAAAGTAAGAAAAAGGCACAGCA
AGCAGCAGCCGCAGCAGCTGACACAGGAAACAGCAGYCAGGTCAGCCAAAATTACCCTATAGTGCAGAAC
CTACAGGGGCAAATGGTA
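
For reference, the header lines above appear to follow the `>gi|<number>|gb|<accession>|<locus> <description>` layout. The following is a minimal Python sketch (not part of the original post) of pulling those fields apart. The names `FastaHeader` and `parse_header` are hypothetical, and the field meanings are an assumption based on the sample headers.

```python
from typing import NamedTuple


class FastaHeader(NamedTuple):
    gi: str           # number after "gi|", e.g. "1532267"
    accession: str    # field after "gb|", e.g. "U68521.1"
    locus: str        # first word of the last field, e.g. "HIVU68521"
    description: str  # remaining free text


def parse_header(line: str) -> FastaHeader:
    """Split a '>gi|<number>|gb|<accession>|<locus> <description>' line.

    Assumes the five-field layout seen in the sample data; any other
    header raises ValueError instead of being guessed at.
    """
    if not line.startswith(">gi|"):
        raise ValueError("not a gi-style FASTA header: " + line[:30])
    parts = line[1:].rstrip("\r\n").split("|", 4)
    if len(parts) != 5:
        raise ValueError("expected 5 '|'-separated fields: " + line[:30])
    _tag, gi, _db, accession, rest = parts
    # The locus name and the description are separated by the first space.
    locus, _, description = rest.partition(" ")
    return FastaHeader(gi, accession, locus, description)
```

Any of these fields could serve as a file name for the split records; which one is preferable depends on how the files will be looked up later.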


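I wrote a batch script for this, but once the file holds more than about 100 sequences it becomes very slow. I hope someone can help revise it so that it is leaner and easier to use.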

@Echo Off
color 0A
Echo    Analyzing sequences. Depending on your computer and the sequence lengths, this may take a few minutes. Please wait...
Echo ================================================================================
for /f "delims=| tokens=1,2*" %%i in (sequences.fasta) do (if "%%j"=="" (echo %%i %%i) else (echo %%i%%j %%i^|%%j^|%%k))>>a
setlocal EnableDelayedExpansion
for /f "delims= tokens=1*" %%i in (a) do (
    set line=%%i
    set line=!line:^>=p !
    for /f "tokens=1,2" %%s in ("!line!") do if "%%s"=="p" (echo %%s %%t %%i%%j) else (echo %%i))>>b
echo y>c
for /f "eol=; tokens=1,2,3*" %%i in (b) do if "%%i"=="p" (type c>>d & echo y %%j>c & echo %%i %%l>>d) else (echo w %%i)>>d
type c>>d
for /f "skip=1 tokens=1,2*" %%i in (d) do if "%%i"=="y" (ren temp %%j.txt) else ((if "%%k"=="" (echo %%j) else (echo %%j %%k))>>temp)
del a b c d
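
As a possible alternative to the batch script, here is a minimal Python sketch (a different language and technique from the original, not a revision of it). It reads the input once and starts a new output file at every line beginning with `>`. The function names `record_name` and `split_fasta` are hypothetical, and naming each file `gi<number>.txt` is an assumption based on the sample headers.

```python
import re
from pathlib import Path
from typing import List


def record_name(header: str) -> str:
    """Derive a file-safe name from a FASTA header line (starts with '>')."""
    fields = header[1:].split("|")
    if len(fields) >= 2 and fields[0] == "gi" and fields[1]:
        return "gi" + fields[1]  # assumed naming scheme: gi<number>
    words = header[1:].split()
    first_word = words[0] if words else "record"
    # Replace characters that are unsafe in file names.
    return re.sub(r"[^A-Za-z0-9._-]", "_", first_word)


def split_fasta(source: Path, out_dir: Path) -> List[Path]:
    """Write each FASTA record in `source` to its own .txt file in `out_dir`."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written: List[Path] = []
    out = None
    try:
        with source.open(encoding="utf-8") as src:
            for line in src:
                if line.startswith(">"):
                    # A header line closes the previous record and opens a new file.
                    if out is not None:
                        out.close()
                    path = out_dir / (record_name(line.rstrip("\r\n")) + ".txt")
                    out = path.open("w", encoding="utf-8")
                    written.append(path)
                # Skip blank lines and anything before the first header.
                if out is not None and line.strip():
                    out.write(line)
    finally:
        if out is not None:
            out.close()
    return written
```

A call such as `split_fasta(Path("sequences.fasta"), Path("split"))` would process the input file named in the batch script; the output folder name `split` is only an example. Two records that map to the same name would overwrite each other in this sketch, so a real run should check for duplicate names first.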


Posted 2007-6-24 21:28
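Why can't the text file simply be split? By the time you finally finish debugging a splitting program, I would already have split it all by hand.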