wicked programmer
helping developers everywhere innovate and learn

General

Quality Issue #1 – Count vs. Count()

Wicked Programmer, October 2, 2021

A measure of how good you are as a developer is how well you can write code. This is the start of a series of posts to help developers write better code. Today we will look at Count vs. Count() in .NET.

Count

There are many collections in .NET that support a property called Count. This includes List<T>, HashSet<T>, and many more. The Count property is a value that represents the number of elements in the collection.

Here is the documentation for the List<T>.Count property.

List<T>.Count Property (System.Collections.Generic) | Microsoft Docs

Count()

.NET supports two interfaces, IEnumerable and IEnumerable<T>, which provide the ability to iterate over a collection. Along with these interfaces comes a set of static extension methods, defined in Enumerable, which add LINQ-based functionality for querying a collection of objects. One of those methods is Count(), which determines the number of items through the IEnumerable<T> abstraction: it iterates over the sequence unless the source implements ICollection<T>, in which case it checks the type and calls that interface's Count.

Here is the documentation for the IEnumerable interface and Enumerable.Count method.

IEnumerable Interface (System.Collections) | Microsoft Docs

IEnumerable<T> Interface (System.Collections.Generic) | Microsoft Docs

Enumerable.Count Method (System.Linq) | Microsoft Docs

Comparison

So why do we care about the difference between Count and Count()? One simply reads a value in memory to determine the number of elements in a collection. The other goes through the LINQ extension method, which at best adds a type check and an interface call and at worst iterates over the entire sequence to count its items.

There is a big performance difference between these two approaches. What is worse, when Count() has to iterate, the difference grows with the size of the collection. This might not seem like a big deal, but it does add up. If you have a large code base or a high-scale application, you will begin to see the impact over time.

Here is an example of code that gets a collection of people using List<Person>. We then use Count and Count() to get the number of people in the collection.

We measure the performance difference between the two approaches. One can see that the Count() extension method takes 7 times longer than the Count property. For sequences that Count() has to enumerate, this gets worse as the number of items increases.
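The measured code is not reproduced on this page, so what follows is only a minimal sketch of such a comparison, not the original benchmark. The Person record, the GetPeople helper, the list size, and the iteration count are all assumptions made for illustration. Stopwatch timings like these are rough, and a dedicated benchmarking library would give more trustworthy numbers.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

// Hypothetical type for illustration; the article's Person class is not shown.
public record Person(string Name);

public static class CountComparison
{
    // Assumed helper that builds a list of the requested size.
    public static List<Person> GetPeople(int size) =>
        Enumerable.Range(0, size).Select(i => new Person($"Person {i}")).ToList();

    // Times repeated reads of the List<T>.Count property.
    public static TimeSpan TimeCountProperty(List<Person> people, int iterations)
    {
        long total = 0;
        var watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            total += people.Count;
        }
        watch.Stop();
        GC.KeepAlive(total); // keep the loop result observable
        return watch.Elapsed;
    }

    // Times repeated calls to the Enumerable.Count() extension method.
    public static TimeSpan TimeCountMethod(List<Person> people, int iterations)
    {
        long total = 0;
        var watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            total += people.Count(); // the call that rule CA1829 flags
        }
        watch.Stop();
        GC.KeepAlive(total);
        return watch.Elapsed;
    }

    public static void Main()
    {
        List<Person> people = GetPeople(100_000);
        const int iterations = 1_000_000;

        Console.WriteLine($"Count property: {TimeCountProperty(people, iterations)}");
        Console.WriteLine($"Count() method: {TimeCountMethod(people, iterations)}");
    }
}
```

Both calls return the same number; only the cost of getting it differs, and the exact ratio will vary by runtime and machine.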

If you don’t believe that this is a quality issue, check out the code analysis tip in Visual Studio by hovering over the Count() method. There you will see code analysis rule CA1829.

The description of rule CA1829 provides a clear reason as to why not to use the Count() method.

“The Count LINQ method was used on a type that supports an equivalent, more efficient Length or Count property.”

Here is the documentation for the Code Analysis Performance Rule CA1829.

CA1829: Use Length/Count property instead of Enumerable.Count method (code analysis) – .NET | Microsoft Docs

Anecdote

I remember being one of four architects on Fidelity’s Active Trader Pro. This is an amazing product put together by about 80 to 90 awesome developers. Count vs. Count() was one of the performance problems we would find in our code reviews. An even bigger challenge was the overuse of LINQ queries written in a fluent programming style. Finding bad LINQ queries was a large part of our performance optimization during the project. This leads me to one of my favorite things to tell developers: “LINQ is convenient, not performant”. Interestingly, I am on a project at the moment where we are addressing quality issues such as Count vs. Count().

Conclusion

It is our hope that you have learned the proper use of Count vs. Count() and that this is the beginning of your journey to improve the quality of the code that you write.

Appreciation

Thanks to our friends at MILL5 for sponsoring this article.

Author(s):

Richard Crane, Founder/CTO

Disclaimer:

All source code is licensed under the Apache 2.0 license.

References:

Count vs. Count() Code Example

https://github.com/MILL5/quality/tree/main/fundamentals/Count

General

Introducing BitArray serialization for Newtonsoft.Json and System.Text.Json

Wicked Programmer, August 11, 2021

We are releasing M5.BitArraySerialization.Json to NuGet. This library allows serialization of the BitArray class in .NET using JSON. It provides custom JSON converters for both Newtonsoft.Json and System.Text.Json.

Serialization using Newtonsoft.Json

Just add the Newtonsoft.Json.BitArrayConverter to your serializer settings.

Serialization using System.Text.Json

Just add the System.Text.Json.BitArrayConverter to your serializer options.
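To illustrate what adding a converter to the serializer options looks like, here is a minimal, hand-written System.Text.Json converter for BitArray. It is a sketch only and is not the converter shipped in M5.BitArraySerialization.Json: the class name SketchBitArrayConverter, the JSON shape (a "length" number plus a Base64 "data" string), and the absence of compression are all assumptions made for this example.

```csharp
using System;
using System.Collections;
using System.Text.Json;
using System.Text.Json.Serialization;

// Hypothetical converter for illustration; not the library's implementation.
public sealed class SketchBitArrayConverter : JsonConverter<BitArray>
{
    public override BitArray Read(ref Utf8JsonReader reader, Type typeToConvert, JsonSerializerOptions options)
    {
        // Parse the whole JSON object, then pull out the two assumed properties.
        using JsonDocument doc = JsonDocument.ParseValue(ref reader);
        int length = doc.RootElement.GetProperty("length").GetInt32();
        byte[] bytes = doc.RootElement.GetProperty("data").GetBytesFromBase64();

        // The byte constructor rounds up to a multiple of 8 bits, so restore the exact length.
        return new BitArray(bytes) { Length = length };
    }

    public override void Write(Utf8JsonWriter writer, BitArray value, JsonSerializerOptions options)
    {
        // Pack the bits into bytes, eight bits per byte.
        byte[] bytes = new byte[(value.Length + 7) / 8];
        value.CopyTo(bytes, 0);

        writer.WriteStartObject();
        writer.WriteNumber("length", value.Length);
        writer.WriteBase64String("data", bytes);
        writer.WriteEndObject();
    }
}

public static class BitArrayJsonDemo
{
    // Serializes and deserializes a BitArray using the sketch converter.
    public static BitArray RoundTrip(BitArray bits)
    {
        var options = new JsonSerializerOptions();
        options.Converters.Add(new SketchBitArrayConverter());

        string json = JsonSerializer.Serialize(bits, options);
        return JsonSerializer.Deserialize<BitArray>(json, options);
    }

    public static void Main()
    {
        var original = new BitArray(new[] { true, false, true, true, false });
        BitArray copy = RoundTrip(original);
        Console.WriteLine($"Bits after round trip: {copy.Length}");
    }
}
```

The registration step (options.Converters.Add) is the part that carries over to any custom converter; consult the library's own documentation for its actual type names and namespaces.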

We even have support for automatic compression of the BitArray using Brotli compression when the size of the array is large enough to take advantage of compression. We leverage this capability in our M5.BloomFilter implementation to decrease the size of our bloom filter when transferring it across the network. More on this topic in a future article.

Enjoy using M5.BitArraySerialization.Json for your own needs.

Author(s):
Richard Crane, Founder/CTO

Disclaimer:

M5.BitArraySerialization.Json is licensed under the Apache 2.0 license.

References:

BitArray Class (System.Collections) | Microsoft Docs

M5.BitArraySerialization.Json – NuGet

https://www.nuget.org/packages/M5.BitArraySerialization.Json/

M5.BitArraySerialization.Json – GitHub

https://github.com/MILL5/M5.BloomFilter/tree/main/M5.BitArraySerialization.Json

General

Introducing FastSearch, a very fast string search for objects and lookups

Wicked Programmer, August 5, 2021

Welcome to our first article regarding fast in-memory search of a list of objects using FastSearch, a .NET class library for fast string-based search brought to you by the team at MILL5.

The motivation behind this library is simple: we want very fast search of a large list of objects based on strings so that we can display the results to users. Of course, there are many ways search can be done in .NET by writing very little code. One way is to loop through a list of objects and use the String class to search the string properties of your objects. Unfortunately, this type of search is not very fast and gets slower the larger the list gets.

A variation on using the String class is to use LINQ. It is extremely easy to write a small amount of LINQ code which queries a list of objects. The amazing part about this approach is how simple this is and how much .NET developers rely on LINQ queries every day. Unfortunately, there is a lot of LINQ code that is the source of performance problems everywhere.
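The two baseline approaches described above can be sketched as follows. This is an illustrative example rather than code from FastSearch: the Customer record and the method names are hypothetical, and the Contains overload that takes a StringComparison assumes .NET Core 2.1 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type for illustration only.
public record Customer(string Name);

public static class BaselineSearch
{
    // Loops over the list and tests each name with String.Contains.
    public static List<Customer> SearchWithLoop(IReadOnlyList<Customer> items, string pattern)
    {
        var results = new List<Customer>();
        foreach (Customer item in items)
        {
            if (item.Name.Contains(pattern, StringComparison.OrdinalIgnoreCase))
            {
                results.Add(item);
            }
        }
        return results;
    }

    // Expresses the same search as a LINQ query.
    public static List<Customer> SearchWithLinq(IEnumerable<Customer> items, string pattern) =>
        items.Where(c => c.Name.Contains(pattern, StringComparison.OrdinalIgnoreCase)).ToList();

    public static void Main()
    {
        var customers = new List<Customer>
        {
            new Customer("Alice"), new Customer("Bob"), new Customer("Alicia")
        };

        Console.WriteLine(SearchWithLoop(customers, "ali").Count);
        Console.WriteLine(SearchWithLinq(customers, "ali").Count);
    }
}
```

Both versions examine every object on every search, which is why their cost grows with the size of the list.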

String Contains and LINQ queries are not optimized and require some help to get better search performance. We will see the different optimizations we do per algorithm and compare their performance. Here is the list of algorithms we will be comparing.

Algorithm: Description

String Contains: Search a list using the String.Contains method

LINQ: Search a list using a LINQ query

Hash: Search a list by precomputing hashes for all possible search patterns

Character Sequence: Search a list using a precomputed index structure which maintains a character sequence tree of all possible search patterns

Each of these algorithms will have different performance characteristics for indexing and searching. Let us look at the index performance for each algorithm.

[Figure: bar chart of index performance for each algorithm]

Notice that the Hash and Character Sequence algorithms take significantly longer to index than the String Contains and LINQ algorithms. That is because String Contains and LINQ do very little to build an index or precompute values for searching. Fortunately for the Hash and Character Sequence algorithms, we assume that building indexes happens once, or so infrequently that our users will not be impacted by the overhead of the indexing process. Of course, when we do have to update our indexes, we do so in the background so that the performance impact is not observed by our users. Once the indexes are rebuilt, we swap out the old indexes for the new indexes.
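The rebuild-then-swap idea can be sketched with a small holder class. This is an assumption about how such a swap might be written, not FastSearch's actual code: SwappableIndex and RebuildAsync are hypothetical names, and the index type is left generic.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical holder that lets readers keep using the old index while a new one is built.
public sealed class SwappableIndex<TIndex> where TIndex : class
{
    // volatile so that readers see the new reference once it is published.
    private volatile TIndex _current;

    public SwappableIndex(TIndex initial)
    {
        _current = initial;
    }

    // Searches read whichever index is current at the time of the call.
    public TIndex Current => _current;

    // Builds a fresh index on a background thread, then swaps it in.
    public Task RebuildAsync(Func<TIndex> build)
    {
        return Task.Run(() =>
        {
            TIndex fresh = build(); // the slow part; the old index stays in use meanwhile
            _current = fresh;       // reference assignment is atomic
        });
    }
}

public static class SwapDemo
{
    public static void Main()
    {
        var holder = new SwappableIndex<string[]>(new[] { "old entry" });
        holder.RebuildAsync(() => new[] { "new entry" }).Wait();
        Console.WriteLine(holder.Current[0]);
    }
}
```

Readers never block on the rebuild; they simply pick up the new index the next time they read Current.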

Let us turn our attention to search performance. Notice that the String Contains and LINQ algorithms take significantly longer to search. That is because we have not done much to improve the performance of these algorithms. Instead, we put the performance optimizations into the Hash and Character Sequence algorithms.

[Figure: bar chart of search performance for each algorithm]

Notice that searching using the Hash and Character Sequence algorithms is very fast compared to String Contains and LINQ. Most of the optimization is due to the data structure used for the index. We do perform other optimizations like precomputing case insensitive strings, binary searching, and parallelism, but these optimizations pale in comparison to the performance gains by using efficient data structures.

The Hash algorithm uses a map based on precomputed hashes of all possible search patterns within the list of objects. When a search is performed, the hash of the search pattern is computed and used to find all matching objects within the map.
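A minimal sketch of this idea, under stated assumptions, is shown below. It is not FastSearch's implementation: the class name SubstringHashIndex is hypothetical, every substring is treated as a possible search pattern, and a Dictionary keyed by string stands in for the hash map (the dictionary computes the hash of each key internally).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical index for illustration: maps every substring to the items that contain it.
public sealed class SubstringHashIndex<T>
{
    private readonly Dictionary<string, List<T>> _map =
        new Dictionary<string, List<T>>(StringComparer.Ordinal);

    public SubstringHashIndex(IEnumerable<T> items, Func<T, string> textSelector)
    {
        foreach (T item in items)
        {
            // Precompute the lower-cased text so searches are case insensitive.
            string text = textSelector(item).ToLowerInvariant();
            var seen = new HashSet<string>(); // avoids adding the same item twice per pattern

            for (int start = 0; start < text.Length; start++)
            {
                for (int length = 1; length <= text.Length - start; length++)
                {
                    string pattern = text.Substring(start, length);
                    if (!seen.Add(pattern)) continue;

                    if (!_map.TryGetValue(pattern, out List<T> bucket))
                    {
                        bucket = new List<T>();
                        _map[pattern] = bucket;
                    }
                    bucket.Add(item);
                }
            }
        }
    }

    // A search is a single hash lookup, regardless of how many items were indexed.
    public IReadOnlyList<T> Search(string pattern)
    {
        if (_map.TryGetValue(pattern.ToLowerInvariant(), out List<T> bucket))
        {
            return bucket;
        }
        return Array.Empty<T>();
    }
}

public static class HashIndexDemo
{
    public static void Main()
    {
        var index = new SubstringHashIndex<string>(new[] { "Alice", "Bob", "Alicia" }, s => s);
        Console.WriteLine(index.Search("ali").Count);
    }
}
```

The trade-off matches the indexing chart discussion: storing every substring makes the index slow and large to build, in exchange for a lookup whose cost does not depend on the list size.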

The Character Sequence algorithm builds a tree of all possible character sequences. When a search is performed, the search pattern is broken into its character sequence. The search is performed by walking the tree to find all matching objects that contain that sequence.
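One way to picture such a tree is a trie built from every suffix of every string, where each node remembers the objects whose text contains the sequence leading to it. The sketch below follows that assumption; it is not the data structure FastSearch actually uses, and the names CharSequenceIndex and Node are hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical character sequence tree for illustration only.
public sealed class CharSequenceIndex<T>
{
    private sealed class Node
    {
        public readonly Dictionary<char, Node> Children = new Dictionary<char, Node>();
        public readonly HashSet<T> Items = new HashSet<T>();
    }

    private readonly Node _root = new Node();

    // Inserts every suffix of the text so that any substring is a path from the root.
    public void Add(T item, string text)
    {
        string lowered = text.ToLowerInvariant();
        for (int start = 0; start < lowered.Length; start++)
        {
            Node node = _root;
            for (int i = start; i < lowered.Length; i++)
            {
                if (!node.Children.TryGetValue(lowered[i], out Node child))
                {
                    child = new Node();
                    node.Children[lowered[i]] = child;
                }
                child.Items.Add(item); // this item contains the sequence ending here
                node = child;
            }
        }
    }

    // Walks the tree one character at a time; the cost depends on the pattern length.
    public IReadOnlyCollection<T> Search(string pattern)
    {
        Node node = _root;
        foreach (char c in pattern.ToLowerInvariant())
        {
            if (!node.Children.TryGetValue(c, out node))
            {
                return Array.Empty<T>();
            }
        }
        return node.Items; // empty for an empty pattern, since the root holds no items
    }
}

public static class CharSequenceDemo
{
    public static void Main()
    {
        var index = new CharSequenceIndex<string>();
        foreach (string name in new[] { "Alice", "Bob", "Alicia" })
        {
            index.Add(name, name);
        }
        Console.WriteLine(index.Search("lic").Count);
    }
}
```

Because common prefixes of different sequences share nodes, a tree like this avoids storing each substring as its own string the way a plain substring map does.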

The Hash and Character Sequence algorithms offer significant performance improvements over the String Contains and LINQ algorithms. The clear winner, though, is the Character Sequence algorithm. This is due to it having the fastest search performance and better index performance than the Hash algorithm. The Character Sequence algorithm also uses a very compact, efficient data structure that takes very little memory.

Take note that search with the Hash and Character Sequence algorithms is 1235 times more scalable, with a significant increase in performance (i.e., a decrease in time spent). That frees up resources which your application can use to perform other operations.

In a future article, we will go into the data structures used for each of these algorithms. In addition, we will be improving this library over time to offer better and faster searching. If you want to use the FastSearch library, add it to your .NET project from NuGet at https://www.nuget.org/packages/FastSearch/. The source is on GitHub at https://github.com/MILL5/FastSearch.

Enjoy using FastSearch for your own needs.

Author(s):
Richard Crane, Founder/CTO

Special Thanks:

Steve Tarmey, Principal Architect

James Pansarasa, Chief Architect

Disclaimer:

FastSearch is licensed under the Apache 2.0 license.

References:

FastSearch – NuGet

https://www.nuget.org/packages/FastSearch/

FastSearch – GitHub

https://github.com/MILL5/FastSearch

Rabin–Karp algorithm

https://en.wikipedia.org/wiki/Rabin%E2%80%93Karp_algorithm

Boyer–Moore–Horspool algorithm

https://en.wikipedia.org/wiki/Boyer%E2%80%93Moore%E2%80%93Horspool_algorithm

General

Searching for the “Perfect” Laptop!

Wicked Programmer, June 7, 2019

As consultants and software developers, we need to be road warriors. Hence the need for a laptop. With the work that we do at MILL5, we are often called upon to do some amazing things quickly. That means productivity is extremely important. Those amazing things are usually big things like “Make my entire application go really, really, really fast while doing millions of things”. All of this means that we need killer laptops.

Here are our specifications for what we feel is the “Perfect” Laptop.

CPU:  8th Generation i9 8950HK (minimum), the more cores and faster the better

GPU:  RTX 2080 with or without Max-Q w/ 8GB

Memory: 64GB or better, 2666MHz or better

Screen Size:  15″ Screen

Screen Type: OLED

Screen Resolution: 3840 x 2160

Biometrics:  Windows Hello (minimum)

Storage:  Minimum of 3 Drives, All SSD, NVMe, PCIe 3.0 or Better

Storage Capabilities:  RAID 0

Ports:  Several USB-C and one HDMI

Battery:  99 Wh

Keyboard:  Per Key RGB

Bling:  Gaming Lights would be Cool!  (i.e. Alienware)

*** MUST RUN COOL ON IDLE ***

*** MUST HAVE ENOUGH COOLING TO RUN UNDER AVERAGE LOAD WITHOUT ENGAGING FANS ***

*** FANS MUST BE PREMIUM IF REQUIRED TO BE RUNNING ***

*** IDEALLY FANLESS AND SILENT ***

© wicked programmer 2023