|
楼主 |
发表于 2014-10-25 09:05
|
显示全部楼层
本帖最后由 wcymiss 于 2014-10-26 10:10 编辑
处理数据的通用方法:
1、数组法:
用split和数组,循环将所需数据取出。
优点:不需其他对象辅助,起点低,会数组即可。
缺点:需要分析数据结构,对于复杂结构的数据,需要多步才能完成。
以作业一的第1题为例:- Sub Main()
- Dim strText As String
- Dim arrRow, arrCell
- Dim i As Long, j As Long, n As Long
- Dim arrColumn
- Dim arrData(1 To 1000, 1 To 10)
-
- With CreateObject("MSXML2.XMLHTTP")
- .Open "GET", "http://data.bank.hexun.com/lccp/jrxp.aspx", False
- .Send
- strText = .responsetext
- End With
-
- arrColumn = Array(, , 9, 12, 14, 16, 18, 20, 22, 24, 26)
- arrRow = Split(strText, "name='proTest' ")
- For i = 1 To UBound(arrRow)
- arrCell = Split(arrRow(i), ">")
- n = n + 1
- arrData(n, 1) = Split(Split(arrCell(0), "value='")(1), "'")(0)
- For j = 2 To 10
- arrData(n, j) = Split(arrCell(arrColumn(j)), "<")(0)
- Next
- Next
-
- Cells.Clear
- Range("a1:j1").Value = Split("产品名称 是否在售 银行 起售日 停售日 币种 管理期(月) 产品类型 预期收益(%) 收益类型", " ")
- Range("a2").Resize(n, 10).Value = arrData
- End Sub
复制代码 2、正则法:
用正则拆解字符串,提取匹配数据,循环取出。
优点:即便复杂结构的数据,也有可能一步到位。
缺点:需要学习正则知识。
70楼获取到QQ群成员清单后,用正则提取成员的昵称和QQ号:
- Sub Main()
- Const gc As String = "" '群号
- Const bkn As String = "" '从fiddler中获取
- Const uin As String = "" 'QQ号
- Const skey As String = "" '从fiddler中获取
- Dim strText As String
- Dim RegMatch As Object
- Dim arrData(1 To 1000, 1 To 2)
- Dim n As Long
-
- With CreateObject("WinHttp.WinHttpRequest.5.1")
- .Open "GET", "http://qinfo.clt.qq.com/cgi-bin/qun_info/get_group_members_new?gc=" & gc & "&bkn=" & bkn, False
- .setRequestHeader "Cookie", "uin=o" & uin & "; skey=" & skey
- .Send
- strText = .responsetext
- Debug.Print strText
- End With
-
- With CreateObject("VBScript.Regexp")
- .Global = True
- .Pattern = "{""b"":\d+,""g"":\d+,""n"":""([^""]*)"",""u"":(\d+)}"
- For Each RegMatch In .Execute(strText)
- n = n + 1
- arrData(n, 1) = RegMatch.submatches(0)
- arrData(n, 2) = RegMatch.submatches(1)
- Next
- End With
-
- Set RegMatch = Nothing
- Cells.Clear
- Range("a1:b1").Value = Array("昵称", "QQ号")
- Range("a2").Resize(n, 2).Value = arrData
- End Sub
复制代码
以上两种方法对于处理获取到的数据一般都能用,但都需先行分析获数据结构。对于复杂结构数据,需要时间和耐心。
table、xml、json各自还有一些专用的提取方法。
|
评分
-
1
查看全部评分
-
|