|
第一次发贴,新手上路,请多指教.
如题,提取任意多行多列中每列都有的重复值,数据量大速度快.这也因有坛友需要编写的,也是边学边编,肯定有更好的算法,希望大家一起讨论交流,相互提高.
- Sub 提取多列中每列都有的重复值()
- '利用排序和二分查找降低时间复杂度,提高效率
- Dim arr
- Dim arr1(), arr_temp()
- Dim i, j, m, g, fi, max_row, max_col As Integer
- 'i为行下标,j为列下标,m为一个数重复次数,g为重复数的多少
- 'max_row为区域最大行,max_col为区域最大列
- t = Timer
- '复制到ss1开始的区域,并对各列(不含第1列)进行排序
-
- Range("a1").CurrentRegion.Copy [ss1]
- Set rng = [ss1].CurrentRegion
- max_row = rng.Rows.Count
- max_col = rng.Columns.Count
- For j = 2 To max_col
- rng.Range("a1:a" & max_row).Offset(0, j - 1) _
- .Sort rng.Cells(1, j), xlAscending
- Next j
- arr = rng '区域转成二维数组
- Range("ss1").CurrentRegion.Clear '清空临时区域
- '查找提取各列中都有的重复值
-
- For i = 1 To max_row
- m = 0
- For j = 2 To max_col
- '利用自定义二分查找函数查找
- fi = twofind(arr, j, arr(i, 1))
- If fi = 1 Then '返回1为找到相同数
- m = m + 1
- Else
- Exit For '没找到退出本轮查找
- End If
-
- If m = max_col - 1 Then
- g = g + 1 '找到各列都含有的重复值,加入数组arr1
- ReDim Preserve arr1(1 To g)
- arr1(g) = arr(i, 1)
- End If
- Next j
- Next i
- Range("h2").Resize(UBound(arr1), 1) = Application.WorksheetFunction.Transpose(arr1)
- MsgBox Timer - t
- End Sub
- '2分查找是否在有序数组里,数据量大时非常快,但字母和汉字比较大小有问题
- Function twofind(arr, col, value) '参数:被查的二维数组,列号,要找的元素值
- Dim L, R, mid As Long
- L = 1: R = UBound(arr)
- Do While R >= L
- mid = (L + R) \ 2
- If value < arr(mid, col) Or arr(mid, col) = "" Then
- R = mid - 1
- twofind = 0
- ElseIf value > arr(mid, col) Then
- L = mid + 1
- twofind = 0
- Else
- twofind = 1
- Exit Do
- End If
- Loop
- End Function
复制代码
以下是早前编的,在数据量大的情况下速度慢了几十倍.
- Sub 提取多列中每列都有的重复值()
- Dim arr
- Dim arr1()
- Dim i, j, k, m, g As Integer
- 'i、k为行下标,j为列下标,m为一个数重复次数,g为重复数的多少
- t = Timer
- arr = Range("a1").CurrentRegion
- For i = 1 To UBound(arr)
- j = UBound(arr, 2)
- m = 0
- k = 1
- Do While k <= UBound(arr)
- If arr(i, 1) = arr(k, j) Then
- m = m + 1: j = j - 1: k = 0
- If j = 1 Then Exit Do
- End If
- k = k + 1
- Loop
-
- If m = UBound(arr, 2) - 1 Then
- g = g + 1
- ReDim Preserve arr1(1 To g)
- arr1(g) = arr(i, 1)
- End If
- Next
- Range("g2").Resize(UBound(arr1), 1) = Application.WorksheetFunction.Transpose(arr1)
- MsgBox Timer - t
- End Sub
复制代码
|
|