Sep 202010

When GUIDs got introduced, people started using them as primary keys (which is ok) and, because of default, most of those GUIDs ended up as clustered index.

Clustered index itself tries to keep values that are close to each other in binary form also close in physical order. This is fine for data that is usually just increasing (like integer identity column) but it is not as good when you have random data. Unfortunately GUIDs (as created with NewId function) are closer to random than to any other order.

NewSequentialId function was introduced to solve this exact problem. It will return values in increasing order but with small catch – it is guarantied only until you restart Windows. Once you reboot Windows starting value might be lower than last one inserted.

Shit really hits the fan if you need to combine date from more than one server. Each one will generate completely different set of increasing GUIDs. This will wreak havoc for your clustered index. Fragmentation, here we come.

And what is result of this “fight”? Should one use NewId or NewSequentialId function?

It simply does not matter. Once you remove clustered index issue from decision process, either function works fine. And I cannot say that I see any scenario where clustered index on GUID is appropriate.

Am I missing something?

P.S. Personally, I would go for NewId – it is shorter to write and it is more user-friendly.

 Leave a Reply

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong>