集合與泛型集合中Contains方法處理效能測試

Contains方法

Contains()方法在很多地方都可以使用,主要是用來比較指定物件內是否含有指定的項目。例如:String 型別的 Contains()方法就經常拿來搜尋或過濾字串使用。

    Dim YesNo As Boolean = "Bruce很帥XD".Contains("帥")
    Console.WriteLine(If(YesNo, "帥", "不帥"))
    Console.ReadLine()
   

但今天想瞭解在含有大量資料時,集合與泛型集合內的Contains方法的處理速度。

測試原因

想瞭解與測試的原因是因為亂馬客寫了一篇 .NET]產生不重覆的數值(List vs HashSet) 的文章,他測試結果是正確的,不過我認為測試範圍可以加大到集合與泛型集合,所以有了以下測試程式碼與測試結果。程式碼參考亂馬客文章進行修改,主要加入了 Counter 變數,記錄下 while 迴圈的執行次數,這樣比較能瞭解誤差值。

集合與泛型集合 - Contains方法

程式碼主要是要產生不重覆的數值,所以重點其實只在 while 那一行程式碼,這裡我加大數值的量,由 1 ~ 100000 裡取出不重覆的 90000 個數值。因為我們只單存取一個數值,所以 Key/Value 的集合與泛型集合就不在測試程式碼之內。

程式碼大同小異,主要只有測試型別的差異,看完第一組測試程式碼,應該就可以直接看結果。

  Module Module1

      Sub Main()
          Dim rand As New Random()
          Const minValue As Integer = 0
          Const maxValue As Integer = 100000
          Const lootValue As Integer = 90000

          ' ArrayList
          Dim arraylistResult As New ArrayList()
          Dim arraylistCounter As Integer = 0

          Dim arraylistTime As New Stopwatch()
          arraylistTime.Start()

          For i = 1 To lootValue
              Dim currentValue As Integer = rand.Next(minValue, maxValue)
              While arraylistResult.Contains(currentValue)
                  currentValue = rand.Next(minValue, maxValue)
                  arraylistCounter += 1
              End While
              arraylistResult.Add(currentValue)
          Next

          Dim arraylistArray() As Integer = arraylistResult.Cast(Of Integer).ToArray()
          arraylistTime.Stop()

          Console.WriteLine("While次數: {0}, ArrayList比較花費時間: {1}", arraylistCounter, arraylistTime.Elapsed)
          Console.ReadLine()

          ' List(Of T)
          Dim listResult As New List(Of Integer)()
          Dim listCounter As Integer = 0

          Dim listTime As New Stopwatch()
          listTime.Start()

          For i = 1 To lootValue
              Dim currentValue As Integer = rand.Next(minValue, maxValue)
              While listResult.Contains(currentValue)
                  currentValue = rand.Next(minValue, maxValue)
                  listCounter += 1
              End While
              listResult.Add(currentValue)
          Next

          Dim listArray() As Integer = listResult.ToArray()
          listTime.Stop()

          Console.WriteLine("While次數: {0}, List(Of T)比較花費時間: {1}", listCounter, listTime.Elapsed)

          ' Queue

          Dim queueResult As New Queue()
          Dim queueCounter As Integer = 0

          Dim queueTime As New Stopwatch()
          queueTime.Start()

          For i = 1 To lootValue
              Dim currentValue As Integer = rand.Next(minValue, maxValue)
              While queueResult.Contains(currentValue)
                  currentValue = rand.Next(minValue, maxValue)
                  queueCounter += 1
              End While
              queueResult.Enqueue(currentValue)
          Next

          Dim queueArray() As Integer = queueResult.Cast(Of Integer).ToArray()
          queueTime.Stop()

          Console.WriteLine("While次數: {0}, Queue比較花費時間: {1}", queueCounter, queueTime.Elapsed)

          ' Queue(Of T)

          Dim queueTResult As New Queue(Of Integer)
          Dim queueTCounter As Integer = 0

          Dim queueTTime As New Stopwatch()
          queueTTime.Start()

          For i = 1 To lootValue
              Dim currentValue As Integer = rand.Next(minValue, maxValue)
              While queueTResult.Contains(currentValue)
                  currentValue = rand.Next(minValue, maxValue)
                  queueTCounter += 1
              End While
              queueTResult.Enqueue(currentValue)
          Next

          Dim queueTArray() As Integer = queueTResult.ToArray()
          queueTTime.Stop()

          Console.WriteLine("While次數: {0}, Queue(Of T)比較花費時間: {1}", queueTCounter, queueTTime.Elapsed)

          ' Stack
          Dim stackResult As New Stack()
          Dim stackCounter As Integer = 0

          Dim stackTime As New Stopwatch()
          stackTime.Start()

          For i = 1 To lootValue
              Dim currentValue As Integer = rand.Next(minValue, maxValue)
              While stackResult.Contains(currentValue)
                  currentValue = rand.Next(minValue, maxValue)
                  stackCounter += 1
              End While
              stackResult.Push(currentValue)
          Next

          Dim stackArray() As Integer = stackResult.Cast(Of Integer).ToArray()
          stackTime.Stop()

          Console.WriteLine("While次數: {0}, stack比較花費時間: {1}", stackCounter, stackTime.Elapsed)

          ' Stack(Of T)
          Dim stackTResult As New Stack(Of Integer)()
          Dim stackTCounter As Integer = 0

          Dim stackTTime As New Stopwatch()
          stackTTime.Start()

          For i = 1 To lootValue
              Dim currentValue As Integer = rand.Next(minValue, maxValue)
              While stackTResult.Contains(currentValue)
                  currentValue = rand.Next(minValue, maxValue)
                  stackTCounter += 1
              End While
              stackTResult.Push(currentValue)
          Next

          Dim stackTArray() As Integer = stackTResult.ToArray()
          stackTTime.Stop()

          Console.WriteLine("While次數: {0}, stack(Of T)比較花費時間: {1}", stackTCounter, stackTTime.Elapsed)

          ' LinkedList - Contains

          Dim linkResult As New LinkedList(Of Integer)()
          Dim linkCounter As Integer = 0

          Dim linkTime As New Stopwatch()
          linkTime.Start()

          For i = 1 To lootValue
              Dim currentValue As Integer = rand.Next(minValue, maxValue)
              While linkResult.Contains(currentValue)
                  currentValue = rand.Next(minValue, maxValue)
                  linkCounter += 1
              End While
              linkResult.AddLast(currentValue)
          Next

          Dim linkArray() As Integer = linkResult.ToArray()
          linkTime.Stop()

          Console.WriteLine("While次數: {0}, LinkedList比較花費時間: {1}", linkCounter, linkTime.Elapsed)

          ' SortedSet - Contains
          Dim sortedResult As New SortedSet(Of Integer)()
          Dim sortedCounter As Integer = 0

          Dim sortedTime As New Stopwatch()
          sortedTime.Start()

          For i = 1 To lootValue
              Dim currentValue As Integer = rand.Next(minValue, maxValue)
              While sortedResult.Contains(currentValue)
                  currentValue = rand.Next(minValue, maxValue)
                  sortedCounter += 1
              End While
              sortedResult.Add(currentValue)
          Next

          Dim sortedArray() As Integer = sortedResult.ToArray()
          sortedTime.Stop()

          Console.WriteLine("While次數: {0}, SortedSet比較花費時間: {1}", sortedCounter, sortedTime.Elapsed)

          ' HashSet - Contains

          Dim hashResult As New HashSet(Of Integer)()
          Dim hashCounter As Integer = 0

          Dim hashtime As New Stopwatch()
          hashtime.Start()

          For i = 1 To lootValue
              Dim currentValue As Integer = rand.Next(minValue, maxValue)
              While hashResult.Contains(currentValue)
                  currentValue = rand.Next(minValue, maxValue)
                  hashCounter += 1
              End While
              hashResult.Add(currentValue)
          Next

          Dim hashArray() As Integer = hashResult.ToArray()

          Console.WriteLine("While次數: {0}, HashSet比較花費時間: {1}", hashCounter, hashtime.Elapsed)
          Console.ReadLine()
      End Sub
  End Module  
  
型別 While次數 花費時間(秒)
ArrayList 142516 53.9209476
List(Of T) 139819 25.6263779
Queue 140608 1:06.0041322
Queue(Of T) 140505 1:01.2269493
stack 141081 56.5163553
stack(Of T) 140443 24.7875047
LinkedList 141833 25.5178281
SortedSet 139665 00.0798503
HashSet 139843 00.0235020

如果要處裡這種類似「不重覆資料」的需求,很明顯可以看得出來 Set 類別大獲全勝,Set 類別在 Contains 方法的處理效率極佳。其實在 MSDN 對 HashSet(Of T) 的說明:HashSet(OfT) 類別會提供高效能的集合作業。但要使用 Set 類別有些許限制:

使用 HashSet 類別的限制:

  1. .NET Framework 支援版本:4.5、4、3.5
  2. 支援平台:Windows 8, Windows Server 2012, Windows 7, Windows Vista SP2, Windows Server 2008 (不支援伺服器核心角色), Windows Server 2008 R2 (SP1 (含) 以後版本支援伺服器核心角色,不支援 Itanium)

使用 SortedSet 類別的限制:

  1. .NET Framework 支援版本:4.5、4
  2. 支援平台:Windows 8, Windows Server 2012, Windows 7, Windows Vista SP2, Windows Server 2008 (不支援伺服器核心角色), Windows Server 2008 R2 (SP1 (含) 以後版本支援伺服器核心角色,不支援 Itanium)

在.NET Framework 2.0 環境還是要使用 List(Of T) 為佳,到了 .NET Framework 3.5 / 4.0 / 4.5 就可以使用更高效的 HashSet(Of T) 類別來處理。

以上文章只是在測試 Contains方法,且取不重覆數字的方法有很多種,例如:C#的解题思路(1):不重复随机数的产生问题生成不重复随机数和固定数组乱序的问题

小結

這種數據應該拿來解讀為:「那一種需求應該採用那一種資料類別。」會比較合適,而不是效能的比較,因為我們的程式也只是測試 Contains方法。每種資料類別本來就是針對不同情境所設計,例如在資料結構我們都會學習搜尋演算法一樣,每種搜尋演算法都有自己的優勢。

另外,從測試數據裡也發現一個有趣的事,以 Contains方法的處理而言,除 Queue(Of T) 之外的泛型集合的處理時間都差不多。以我們不重覆數值的範例而言,在 .NET Framework 2.0 環境並非一定要 List(Of T) 不可。

參考資料

沒有留言:

張貼留言

感謝您的留言,如果我的文章你喜歡或對你有幫助,按個「讚」或「分享」它,我會很高興的。