class: center, middle

# A Brief Chat About Identifiers

---
class: center, middle

# What's an Identifier?

---
class: center, middle

![](https://images.onlinelabels.com/Images/Predesign/00000002/1116/Blue-Name-Tag.png)

---
class: center, middle

![](https://upload.wikimedia.org/wikipedia/commons/thumb/5/5d/UPC-A-036000291452.png/220px-UPC-A-036000291452.png)

---
class: center, middle

![](https://upload.wikimedia.org/wikipedia/commons/8/84/Seriennummer.JPG)

---
class: middle

- Brian
- 1234
- https://en.wikipedia.org/wiki/University_of_Chicago
- http://example.org/ark:/12025/654xz321/s3/f8.05v.tiff
- uid:c21a5d48-0263-4060-ac68-eb0fe591ddd4
- c21a5d48-0263-4060-ac68-eb0fe591ddd4

---

# What are identifiers for?

.center[**Identifying** things]

--

.center[...]

.center[duh.]

---

# What _exactly_ is identification?

--

.center[![](https://upload.wikimedia.org/wikipedia/commons/8/83/Naming_and_Necessity.jpg)]

.footnote[https://en.wikipedia.org/wiki/Naming_and_Necessity]

---

# What _exactly_ is identification?

### Let's use some math

--

- A **set** is a collection of distinct objects, considered as an object in its own right.
- An **element** of a set is any one of the distinct objects that make up that set.

.footnote[https://en.wikipedia.org/wiki/Set_(mathematics)#Membership https://en.wikipedia.org/wiki/Element_(mathematics)]

--

Identification, in this sense, is the isolation of **one** element in a set.

--

How do we know what's in a set?

--

We define (explicitly and/or via a function) all the elements in it.

---

# Functions

In mathematics, a **function** is a relation between a set of inputs and a set of permissible outputs with the property that each input is related to exactly one output.

.center[![](https://upload.wikimedia.org/wikipedia/commons/thumb/3/3b/Function_machine2.svg/220px-Function_machine2.svg.png)]

Sets can be constructed explicitly:

{1, 2, foo, bar, 7.5}

or via functions:

{f(x) for x in {inputs}}

.footnote[https://en.wikipedia.org/wiki/Function_(mathematics)]

---

# Domains

In this talk, a domain can be thought of as a special kind of set: all potential outputs of a function, taken over every input for which that function is defined. (Strictly, mathematics calls this set the function's image and reserves "domain" for the set of inputs.)

domain = {f(x) for x in {**all potential inputs**}}

.footnote[https://en.wikipedia.org/wiki/Domain_of_a_function]

--

Domains are important for a couple of reasons:

- If two domains, as defined by two different functions, don't intersect, then neither function will _ever_ generate an output that "collides" with an output of the other.
- Functions which map one domain to another can have some interesting properties
    - When a function maps from one domain to another, we call the "input" domain the domain, and the domain the function maps onto the "codomain"
    - Kinds of relations (pictures to follow)
        - injective: Each element in the domain maps to a distinct element in the codomain
        - surjective: All elements in the codomain are mapped to by **at least** one element in the domain.
        - bijective: injective && surjective

---
class: center

# Injective (one to one)

![injective](https://upload.wikimedia.org/wikipedia/commons/thumb/0/02/Injection.svg/200px-Injection.svg.png)

Each element in the domain maps to a distinct element in the codomain

---
class: center

# Surjective (onto)

![surjective](https://upload.wikimedia.org/wikipedia/commons/thumb/6/6c/Surjection.svg/200px-Surjection.svg.png)

All elements in the codomain are mapped to by **at least** one element in the domain.
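---

# Checking These Properties in Code

For finite sets, the three kinds of relation can be tested directly. The following is a minimal sketch, assuming a function is represented as a Python `dict` from inputs to outputs. The helper names and sample mappings are illustrative only and come from no particular library.

```python
def is_injective(mapping):
    """True if no two inputs share the same output."""
    outputs = list(mapping.values())
    return len(outputs) == len(set(outputs))


def is_surjective(mapping, codomain):
    """True if every element of the codomain is mapped to by at least one input."""
    return set(codomain) <= set(mapping.values())


def is_bijective(mapping, codomain):
    """True if the mapping is both injective and surjective."""
    return is_injective(mapping) and is_surjective(mapping, codomain)


codomain = {"a", "b", "c"}

one_to_one = {1: "a", 2: "b"}            # injective, but nothing maps to "c"
onto = {1: "a", 2: "b", 3: "c", 4: "c"}  # surjective, but 3 and 4 share "c"
both = {1: "a", 2: "b", 3: "c"}          # injective and surjective

for name, mapping in [("one_to_one", one_to_one), ("onto", onto), ("both", both)]:
    print(name,
          is_injective(mapping),
          is_surjective(mapping, codomain),
          is_bijective(mapping, codomain))
```

Only `both` passes all three checks, which is the property an identifier scheme is after.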
---
class: center

# Bijective

Injective && Surjective

![bijective](https://upload.wikimedia.org/wikipedia/commons/thumb/a/a5/Bijection.svg/200px-Bijection.svg.png)

Each element in the domain maps to a distinct element in the codomain

and

All elements in the codomain are mapped to by **at least** one element in the domain.

--

At least one + Mapping to distinct elements == each element in the codomain is paired with exactly one element in the domain

---

# A Note on Cardinality

The "cardinality" of a set is the measure of how many things are in it.

Cardinality determines whether or not it is possible to create injective, surjective, or bijective functions which map between two sets.

Generalized: Given two sets, A and B:

- |A| = |B| if there exists a bijective function between A and B
- |A| <= |B| if there exists an injective function from A to B
- |A| < |B| if there is an injective function, but no bijective function, from A to B

**Cardinalities can be compared between two infinite sets**, and thus the injective, surjective, and bijective nature of functions that map between infinite sets can be determined.

.footnote[https://en.wikipedia.org/wiki/Cardinality]

---

# So, what exactly is identification?

Given a set of identifiers, confined within some domain, and a set of Things™, confined within some codomain, we can use some function f that is bijective from the domain of the identifiers to the codomain of the Things™, thereby **identifying** them.

Essential Parts to Consider:

- Domain of identifiers
- Domain of Things™
- Set of identifiers
- Set of Things™
- A function that bijectively maps elements in the set of identifiers to elements in the set of Things™

---

# Domain of Identifiers

- The domain of identifiers should be well defined
- The domain of identifiers shouldn't collide with pre-existing domains
    - https://en.wikipedia.org/wiki/Category:Identifiers
    - myriad smaller, undocumented systems
    - systems with implicit scope
- The cardinality of the domain of identifiers must be >= the cardinality of the set of Things™
    - assuming, of course, you want all Things™ mapped at all times.
- The mapping function **must not** be surjective but not bijective from the domain of identifiers to the codomain of Things™
    - A single identifier shouldn't map to two Things™

**Re-evaluate these constraints if changing the domain**

---

# Domain of Things™

- The domain of Things™ should be well defined
- Usually has some constraints based on the Things™
- Informs the domain of identifiers (particularly, the cardinality) as detailed previously
- Importantly, can sometimes be finite
- Also importantly, can cause major problems when it is assumed to be finite but turns out to be infinite

**Re-evaluate these constraints if changing the domain**

---

# Set of Identifiers, and Set of Things™

Often, before generating identifiers, you have a set of Things™

Sometimes, before seriously considering your identifier scheme, you have a set of pre-existing identifiers

Be **sure** that any pre-existing elements fall within your defined domain, and keep in mind that the more complex the function that defines the domain is, the more complex the rest of the considerations necessarily become.

These sets most likely represent (or will represent) a _significant_ amount of state information which must be managed in your system.
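---

# Sketch: Checking an Identification

The parts listed so far can be put together in a few lines. This is a hypothetical illustration, assuming the domain of identifiers is "canonical lowercase UUID strings" and that the Things™ are plain file names. The functions `in_identifier_domain` and `identifies`, and the sample data, are invented for this example.

```python
import uuid


def in_identifier_domain(candidate):
    """True if candidate is a canonical lowercase UUID string (the assumed domain)."""
    if not isinstance(candidate, str):
        return False
    try:
        return str(uuid.UUID(candidate)) == candidate
    except ValueError:
        return False


def identifies(mapping, things):
    """True if mapping is a bijection from in-domain identifiers onto things."""
    targets = list(mapping.values())
    return (
        all(in_identifier_domain(identifier) for identifier in mapping)
        and len(targets) == len(set(targets))  # no two identifiers share a Thing
        and set(targets) == set(things)        # every Thing has an identifier
    )


things = {"f8.05v.tiff", "letter.pdf"}  # hypothetical Things
mapping = {str(uuid.uuid4()): thing for thing in sorted(things)}

print(identifies(mapping, things))
print(in_identifier_domain("c21a5d48-0263-4060-ac68-eb0fe591ddd4"))
print(in_identifier_domain("1234"))
```

A pre-existing identifier such as `1234` falls outside the assumed domain, so a mapping that includes it fails the check until either the domain or the identifier is changed.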
---

# Mapping Function

- Should be designed with the identifier domain and Thing™ domain in mind
- Often _heavily_ informed by context

---

So, now we have a solid understanding of what an identifier is abstractly, as well as what the action of identification is.

This mathematical model provides a great base for understanding identifiers, and the act of identification, but...

--

it masks quite a bit of complexity in a couple of ways:

--

- It doesn't concern itself with the particulars of the required function(s); it only stipulates the properties the functions must have

--

- It also deals with 'functions' only in the mathematical sense, glossing over the function_al_ requirements of a system mediated by computers and networks.

--

- It doesn't take into account the constraints of the encapsulating system: _why_ you're generating the identifiers, how, when, etc.

--

- It only defines one 'layer' of an identifier system. Such systems often employ several layers, resolving identifiers to new identifiers, to another identifier [...] until the final identifier resolves to the Thing™

--

- It doesn't take into account the knowledge transfer problems inherent in widely using identifiers
    - How do you communicate the domains?
    - How do you communicate the agreed upon functions?
    - How do you manage state across a distributed identifier system?

---
class: middle

.center[**Thank you**]

.center[Questions?]

.center[Brian Balsamo]

.center[brian@brianbalsamo.com]