Are There Any Drawbacks To Relying On The System.Guid.NewGuid() Function When Looking For Unique IDs For Data?


Answer :

I'm looking to generate unique ids for identifying some data in my system.

I'd recommend a GUID then, since they are by definition globally unique identifiers.

I'm using an elaborate system which concatenates some (non unique, relevant) meta-data with System.Guid.NewGuid(). Are there any drawbacks to this approach, or am I in the clear?

Well, since we do not know what you would consider a drawback, it is hard to say. A number of possible drawbacks come to mind:

  • GUIDs are big: 128 bits is a lot of bits.

  • GUIDs are not guaranteed to have any particular distribution; it is perfectly legal for GUIDs to be generated sequentially, and it is perfectly legal for them to be distributed uniformly over their 122-bit space (128 bits minus the four version bits and the two variant bits). This can have serious impacts on database performance if the GUID is used as the primary key of a table whose index is kept in sorted order by GUID; insertions are much more efficient if the new row always goes at the end. A uniformly distributed GUID will almost never be at the end.

  • Version 4 GUIDs are not necessarily cryptographically random; if GUIDs are generated by a non-crypto-random generator, an attacker could in theory predict what your GUIDs are when given a representative sample of them. An attacker could in theory determine the probability that two GUIDs were generated in the same session. Version one GUIDs are of course barely random at all, and can tell the sophisticated reader when and where they were generated.

  • And so on.
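The size and version points above can be checked directly. The following is only a minimal sketch: `GuidInspector` and its method names are made up for illustration, and it assumes the standard xxxxxxxx-xxxx-Vxxx-xxxx-xxxxxxxxxxxx text layout in which V is the version digit.

```csharp
using System;

// Hypothetical helper for illustration; not part of the .NET API.
public static class GuidInspector
{
    // Size of the binary form: 128 bits = 16 bytes.
    public static int SizeInBytes(Guid g) => g.ToByteArray().Length;

    // Size of the default text form (32 hex digits plus 4 hyphens).
    public static int SizeAsText(Guid g) => g.ToString().Length;

    // Version digit: the first hex digit of the third group,
    // which is index 12 of the hyphen-free "N" format.
    public static int Version(Guid g) =>
        Convert.ToInt32(g.ToString("N").Substring(12, 1), 16);
}
```

For a value from Guid.NewGuid(), `Version` should report 4, since the .NET documentation describes NewGuid() as producing version 4 GUIDs. A version 1 GUID such as 6ba7b810-9dad-11d1-80b4-00c04fd430c8 reports 1. Stored as text, a GUID takes 36 characters, more than twice its 16-byte binary size.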

I am planning a series of articles about these and other characteristics of GUIDs in the next couple of weeks; watch my blog for details.

UPDATE: https://ericlippert.com/2012/04/24/guid-guide-part-one/


When you use System.Guid.NewGuid(), you may still want to check that the guid doesn't already exist in your system.

While a guid is so complex as to be virtually unique, there is nothing to guarantee that it doesn't already exist except probability. It's just incredibly statistically unlikely, to the point that in almost any case it's the same as being unique.

Generating two identical GUIDs is like winning the lottery twice: there's nothing to actually prevent it, it's just so unlikely that it might as well be impossible.
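To put a rough number on "so unlikely", here is a small sketch of the usual birthday-bound approximation. It assumes version 4 GUIDs with 122 random bits that are drawn uniformly and independently, which is an idealization; `GuidCollisionEstimate` is a hypothetical name used only for this example.

```csharp
using System;

// Hypothetical helper for illustration; not part of the .NET API.
public static class GuidCollisionEstimate
{
    // Assumption: 122 random bits per version 4 GUID
    // (128 bits minus 4 version bits and 2 variant bits).
    public const int RandomBits = 122;

    // Birthday-bound approximation: p ≈ n(n-1) / (2 * 2^bits).
    // Only meaningful while the result is much smaller than 1.
    public static double Approximate(double count) =>
        count * (count - 1) / (2.0 * Math.Pow(2.0, RandomBits));
}
```

Under these assumptions, `GuidCollisionEstimate.Approximate(1e9)` comes out on the order of 1e-19, so even a billion generated GUIDs leave the chance of any duplicate vanishingly small.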

Most of the time you could probably get away with not checking for existing matches, but in a very extreme case with lots of generation going on, or where the system absolutely must not fail, it could be worth checking.

EDIT

Let me clarify a little more. It is highly, highly unlikely that you would ever see a duplicate guid. That's the point. It's "globally unique", meaning there's such an infinitesimally small chance of a duplicate that you can assume it will be unique. However, if we are talking about code that keeps an aircraft in the sky, monitors a nuclear reactor, or handles life support on the International Space Station, I, personally, would still check for a duplicate, just because it would really be terrible to hit that edge case. If you're just writing a blog engine, on the other hand, go ahead, use it without checking.
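If you do decide to check, one way it could look is sketched below. `CheckedGuidSource` is a hypothetical class, and the in-memory HashSet stands in for whatever store your system actually uses (in a database, a unique constraint on the key column would play the same role). The sketch is not thread-safe.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical wrapper for illustration; not part of the .NET API.
public sealed class CheckedGuidSource
{
    // Every GUID this instance has handed out so far.
    private readonly HashSet<Guid> _issued = new HashSet<Guid>();

    public int IssuedCount => _issued.Count;

    // Returns a GUID that this instance has not issued before.
    public Guid Next()
    {
        Guid candidate;
        do
        {
            candidate = Guid.NewGuid();
        } while (!_issued.Add(candidate)); // Add returns false for a duplicate, so retry.
        return candidate;
    }
}
```

The retry loop will, for all practical purposes, run exactly once per call; the cost of the check is the memory for the set and a hash lookup per GUID.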


Feel free to use NewGuid(). There is no problem with its uniqueness.

The probability that it will generate the same GUID twice is far too low to worry about; a nice example can be found here: Simple proof that GUID is not unique

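    using System;
    using System.Collections.Generic;

    var bigHeapOGuids = new Dictionary<Guid, Guid>();
    try
    {
        do
        {
            Guid guid = Guid.NewGuid();
            // Add throws ArgumentException if the key is already present.
            bigHeapOGuids.Add(guid, guid);
        } while (true);
    }
    catch (OutOfMemoryException)
    {
        // Memory runs out long before a duplicate key shows up.
    }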

At some point it ran out of memory and stopped with an OutOfMemoryException, not with a duplicate-key conflict.

