When SharePoint is used as a platform for document management, there is often a requirement to assign unique identifiers to documents. The Document ID feature was introduced in SharePoint 2010 and is also available in SharePoint 2013. It is limited to a site collection, which is a big disadvantage: your document identifiers are unique only within one site collection. That would still be a pretty useful feature, except that document management systems usually spread across multiple site collections or even web applications. There are two ways to ensure Document IDs are unique across site collections: either set a different Document ID prefix on each site collection or develop a custom Document ID provider. In this post I will describe what Document ID is and how it works, and I’ll provide a custom Document ID provider solution and explain its details.
How the gears are actually turning
Let’s start with how Document ID works in SharePoint. The whole thing consists of three main parts:
- Activation – Document ID feature;
- Activation – Document ID enable/disable timer job;
- Document ID Generator event receiver/Document ID assignment job
Document ID Service feature
This is a site collection feature which enables Document ID functionality at the site collection level. It needs to be activated in every site collection where you intend to use Document IDs. Once activated, this feature does a couple of things:
- Adds a Document ID settings option under Site Collection Administration;
- Generates a Document ID prefix automatically;
- Checks the size of the site collection and, depending on the result, either assigns Document IDs straight away or creates work items to do that later. The function call used for that is DocIdHelpers.IsSiteTooBig(site, 1, 20, 40). Let me disassemble this for you: “Please check this site collection and if it has at least 1 web site or 40 lists or 20 libraries, consider it TOOBIG”. *chuckle*. Notice the ORs instead of ANDs.
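To make the OR logic concrete, here is a minimal sketch of the rule as described above. It is not the SharePoint implementation: the class, method and parameter names are hypothetical, and the thresholds are simply the ones quoted in the text.

```csharp
public static class SiteSizeCheck
{
    // Hypothetical model of the check described above, NOT the real DocIdHelpers code.
    // A site collection counts as "too big" as soon as ANY single threshold is reached.
    public static bool IsTooBig(int webCount, int libraryCount, int listCount,
                                int webLimit = 1, int libraryLimit = 20, int listLimit = 40)
    {
        return webCount >= webLimit
            || libraryCount >= libraryLimit
            || listCount >= listLimit;
    }
}
```

With these thresholds, `SiteSizeCheck.IsTooBig(1, 0, 0)` already returns true, because one web site is enough to trip the first condition.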
Keep in mind that work items are created with a scheduled date and there is no way to change it. If you don’t believe me, you can query the ScheduledWorkItems table in the content database.
The query will return a work item which will be processed by the Document ID enable/disable job, with a delivery date 30 minutes after the moment you activated the feature. This means that no matter how many times you run the timer jobs in the first 30 minutes after activating the feature, nothing will happen. Take a break, have a coffee and navigate to Document ID Settings. If you still see the message “Configuration of the Document ID feature is scheduled to be completed by an automated process”, the timer job hasn’t done its job yet.
Top Tip #331: In case you can’t wait, change the system time, restart the timer service and run the timer jobs.
There are two of them:
- Document ID enable/disable job
This will process all work items on all site collections in a web application and make sure that the Document ID prefix is pushed to all subsites. The Document ID field will be added to all content types which inherit from Document and Document Set. More precisely, three fields will be added: Document ID, Document ID Value and Persist ID. In addition to adding these columns, SharePoint adds an event receiver to each of the content types so that it runs every time a document or document set is uploaded to SharePoint. The server uses the ItemAdded event to ensure that Document ID providers can use item metadata when assigning Document IDs.
- Document ID assignment job
This will push the settings to all lists and assign Document IDs to the documents.
Document ID Generator Event Receiver
When items are added to a site collection, SharePoint assigns or reassigns a Document ID. When a new item is added, SharePoint first checks whether the item already has a Document ID. If it does, SharePoint checks whether the Preserve ID attribute is set to True or False, and sets it to False if it is currently True. If the item does not already have a Document ID, the server gets a Document ID for the item from the specified provider, writes it to metadata, and sets the Preserve ID attribute to False. I hope this is not confusing.
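In case it is confusing, the steps can be written down as a tiny model. This is only a sketch of the behaviour described in the paragraph above, not the actual event receiver: `DocumentItem`, `DocumentIdAssignment` and the `generateId` delegate are hypothetical stand-ins for the list item and the configured provider.

```csharp
using System;

// Hypothetical stand-in for a list item carrying the two relevant pieces of metadata.
public class DocumentItem
{
    public string DocumentId { get; set; }
    public bool PreserveId { get; set; }
}

public static class DocumentIdAssignment
{
    // Models only the steps stated in the text. What happens to an existing ID
    // when the flag is already False is not described, so it is left untouched here.
    public static void OnItemAdded(DocumentItem item, Func<string> generateId)
    {
        if (!string.IsNullOrEmpty(item.DocumentId))
        {
            // The item already has an ID: reset the flag if it is currently True.
            if (item.PreserveId)
            {
                item.PreserveId = false;
            }
            return;
        }

        // No ID yet: ask the provider, write the ID to metadata, clear the flag.
        item.DocumentId = generateId();
        item.PreserveId = false;
    }
}
```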
This is how out-of-the-box Document ID works in a nutshell. With a semi-fixed ID pattern and functionality limited to a single site collection, you end up with another bright idea that is almost useless for an enterprise document management system.
Web application Document ID
Out-of-the-box functionality should be sufficient for some folks, but it wasn’t in my case. The enterprise required the IDs to follow a specific format and notation, so I had to implement a custom Document ID Provider. It had to spray unique identifiers across all site collections within a web application. The solution consists of the following parts:
- Code to generate Document ID;
- List instance to store last Document ID and a Scheme;
- Settings page;
- Custom action to add a link to Settings page;
- A feature to activate the custom Document ID provider.
Document ID Provider
This is the core of the solution. To create a custom Document ID Provider we need to derive a class from Microsoft.Office.DocumentManagement.DocumentIdProvider. The class contains 3 abstract methods and 1 abstract property that we need to implement:
- public override string GenerateDocumentId(SPListItem listItem). This method is called when a new Document ID needs to be created. The current item is handed over to this method as a parameter. This is the place where you generate a Document ID or get it from another system.
- public override string GetSampleDocumentIdText(SPSite site). Returns an example Document ID value that will be displayed in the Document ID search web part. The method is called when the Find By Document ID web part is rendered.
- public override bool DoCustomSearchBeforeDefaultSearch. This property determines how documents will be retrieved by their Document ID. If it’s set to False, documents will be retrieved using SharePoint Search first. If it’s set to True, GetDocumentUrlsById will be used before SharePoint Search. Note that this only defines the priority of the search methods and that the second search method will be used only if the first one doesn’t return a result.
- public override string[] GetDocumentUrlsById(SPSite site, string documentId). Returns an array of URLs pointing to documents with the specified Document ID. Implementing this has an advantage over using the default SharePoint Search: most of the time it’s faster, because there is no need to wait for search crawls to finish. If you let SharePoint Search do the work instead, this method should return an empty array of strings.
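Put together, the four members give a skeleton like the one below. Treat it as a minimal sketch rather than the provider from this post: the class name, the "DMS-" prefix and the GUID-based ID are placeholder assumptions, chosen only so the class compiles against the SharePoint Server assemblies (Microsoft.SharePoint and Microsoft.Office.DocumentManagement).

```csharp
using System;
using Microsoft.Office.DocumentManagement;
using Microsoft.SharePoint;

// Hypothetical provider skeleton; the ID logic is a placeholder, not the list-based one.
public class SkeletonDocumentIdProvider : DocumentIdProvider
{
    // Called whenever a new Document ID is needed for the given item.
    public override string GenerateDocumentId(SPListItem listItem)
    {
        if (listItem == null)
        {
            throw new ArgumentNullException("listItem");
        }
        // Placeholder assumption: a GUID is unique everywhere, but not pretty.
        return "DMS-" + Guid.NewGuid().ToString("N");
    }

    // Example value shown by the Find By Document ID web part.
    public override string GetSampleDocumentIdText(SPSite site)
    {
        return "DMS-0000005";
    }

    // False: try SharePoint Search first, then GetDocumentUrlsById.
    public override bool DoCustomSearchBeforeDefaultSearch
    {
        get { return false; }
    }

    // Empty array: leave the lookup to SharePoint Search.
    public override string[] GetDocumentUrlsById(SPSite site, string documentId)
    {
        return new string[0];
    }
}
```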
If DoCustomSearchBeforeDefaultSearch is True, then returning an empty array of strings will tell DocIdRedir.aspx to try searching again using SharePoint Search.
If DoCustomSearchBeforeDefaultSearch is False and neither SharePoint Search nor GetDocumentUrlsById returned any results, a message “No documents with the ID were found in this site collection” will be displayed.
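The lookup order described above can be modelled in a few lines. This is only a sketch of the documented behaviour, not the code behind DocIdRedir.aspx: `DocumentLookup.Resolve` and both delegates are hypothetical, with `customLookup` standing in for GetDocumentUrlsById and `searchLookup` for SharePoint Search.

```csharp
using System;

public static class DocumentLookup
{
    // Hypothetical model: the flag only decides which lookup runs first;
    // the other one is used as a fallback when the first returns nothing.
    public static string[] Resolve(string documentId, bool customSearchFirst,
                                   Func<string, string[]> customLookup,
                                   Func<string, string[]> searchLookup)
    {
        Func<string, string[]> first = customSearchFirst ? customLookup : searchLookup;
        Func<string, string[]> second = customSearchFirst ? searchLookup : customLookup;

        string[] urls = first(documentId);
        if (urls != null && urls.Length > 0)
        {
            return urls;
        }

        // An empty result from both lookups corresponds to the "No documents" message.
        return second(documentId) ?? new string[0];
    }
}
```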
Document ID List
I’ve chosen a list to store all information for the following reasons:
- SPListItem.ID works well as a unique identifier.
- Use of SPList.Properties to store configuration data.
- Getting all information for Document ID from one object.
Let me explain how the whole thing comes together:
- A document is added to a library;
- Document ID Generator Event Receiver kicks in;
- Document ID Provider generates a Document ID and returns it.
This is how the Document ID Provider generates a new ID:
[Figure: So simple it didn’t even need a scheme]
A Document ID is generated by merging an SPListItem.ID and a Scheme. Let’s take an example scheme of “DMS-0000000” and an ID of “5”. What I wanted to end up with is “DMS-0000005”: the number 5 is not appended at the end, the zeros act as placeholders for its digits instead. This magic code here does the thing:
Since the string is being formatted twice, you might want to add various checks (or replacements like Replace("E0", "\\E0") etc.) if the scheme includes any of the formatting keywords which you can find here.
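For illustration, here is one way to get the same result with a .NET custom numeric format string. It is a sketch under my own assumptions and not necessarily the code from the project: `DocumentIdFormatter` is a hypothetical helper, it formats once via `int.ToString`, and it escapes every E and e rather than only "E0", because sequences such as "E-0" (think of a scheme like "SITE-0000") are also read as exponent notation.

```csharp
using System;
using System.Globalization;

public static class DocumentIdFormatter
{
    // Hypothetical helper: merges a numeric ID into a scheme such as "DMS-0000000".
    // The zeros are digit placeholders; other letters and '-' are copied literally.
    public static string Format(string scheme, int id)
    {
        // Escape E/e so they can never start an exponent specifier (E0, E+0, E-0).
        // Other special characters (#, '.', ',', '%', ';', quotes) are NOT handled here.
        string safeScheme = scheme.Replace("E", "\\E").Replace("e", "\\e");
        return id.ToString(safeScheme, CultureInfo.InvariantCulture);
    }
}
```

Calling `DocumentIdFormatter.Format("DMS-0000000", 5)` returns "DMS-0000005".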
The settings page is an application page used to edit and store Document ID settings in the list’s property bag. For now, there is one editable setting on the page: the Document ID scheme.
Custom Document ID Provider feature
The feature receiver needs to run some code to set your custom Document ID provider for a site collection:
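A minimal sketch of such a receiver is shown below. It assumes a site collection scoped feature and uses DocumentId.SetProvider and DocumentId.SetDefaultProvider from Microsoft.Office.DocumentManagement. The abstract base class and its CreateProvider method are hypothetical names, used here only so the sketch does not depend on a particular provider class; restoring the default provider on deactivation is also my assumption.

```csharp
using Microsoft.Office.DocumentManagement;
using Microsoft.SharePoint;

// Hypothetical base receiver: a concrete subclass returns the provider to register.
public abstract class DocumentIdProviderFeatureReceiver : SPFeatureReceiver
{
    // Assumption: subclasses create their own custom provider instance here.
    protected abstract DocumentIdProvider CreateProvider();

    public override void FeatureActivated(SPFeatureReceiverProperties properties)
    {
        // For a site collection scoped feature the parent is the SPSite.
        SPSite site = properties.Feature.Parent as SPSite;
        if (site != null)
        {
            DocumentId.SetProvider(site, CreateProvider());
        }
    }

    public override void FeatureDeactivating(SPFeatureReceiverProperties properties)
    {
        SPSite site = properties.Feature.Parent as SPSite;
        if (site != null)
        {
            // Hand control back to the out-of-the-box provider.
            DocumentId.SetDefaultProvider(site);
        }
    }
}
```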
This is how a custom Document ID provider works.
You can find the complete VS2013 project here. With a few easy modifications the solution could be made farm-wide. Play around with the source code. There isn’t much code and it’s pretty straightforward.
Document ID is not generated
The most common issue. Ahem, probably the only one. Here is what I would suggest checking, in this particular order:
- Check if the Document ID Service feature is activated. *DOH*
- Check Document ID Settings under Site Collection Administration and see if there is a message “Configuration of the Document ID feature is scheduled to be completed by an automated process”. If this is the case, wait 30 minutes or refer to the beginning of this post for how to speed up the Document ID provisioning process.
- Check if the Document ID fields are available in the current web.
- Check if the Document ID Generator event receivers are added to the library.
- Check that the content types are not marked as Read Only or Sealed.
- If you are using a custom library template, check if Document ID works in a standard document library. If it does, there might be issues with your template.
Hope this helps!