Microsoft Syntex has a new feature called taxonomy tagging. It allows existing Managed Metadata (term store) columns to be added to a library, and AI will then automatically tag documents that match terms in the term store. The tagged documents are then easier for users to find through browsing and search.
No rules or model training are required to set up taxonomy tagging. It is all configured by the end user in a library, in the column settings of a managed metadata column. There is a single configuration option: a Yes/No toggle named Automatically tag documents with terms. As the name implies, the feature only works with managed metadata (term store) columns, and taxonomy tagging is only available for the following file types: .doc, .docx, .pdf, and .pptx.
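Because only four file types are supported, it can be handy to check a batch of files before uploading them. The following is a minimal Python sketch, not part of Syntex itself; the function name is_taggable and the sample file names are made up for illustration.

```python
from pathlib import PurePath

# File types the article lists as supported by taxonomy tagging.
SUPPORTED_EXTENSIONS = {".doc", ".docx", ".pdf", ".pptx"}


def is_taggable(filename: str) -> bool:
    """Return True if the file extension is one of the supported types."""
    # Compare in lower case so "Report.PDF" is treated like "report.pdf".
    return PurePath(filename).suffix.lower() in SUPPORTED_EXTENSIONS


# Hypothetical file names, used only to demonstrate the check.
uploads = ["Incident-001.docx", "Fleet.xlsx", "Report.PDF", "notes.txt"]
eligible = [name for name in uploads if is_taggable(name)]
print(eligible)  # ['Incident-001.docx', 'Report.PDF']
```

Anything filtered out here (spreadsheets, text files, etc.) would simply be left untagged by the feature.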
What is great about this feature is that organisations may already have extensive term sets in SharePoint containing commonly used terms, e.g. suppliers, departments, assets, locations, etc. A column can then be configured in a library pointing to the supplier term set, for example. Then, if any documents containing a particular supplier name (e.g. Coca-Cola) are added to the library, they will be auto-tagged with Coca-Cola in the supplier column. A maximum of three columns in a library can be configured with taxonomy tagging.
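To make the expected outcome concrete, here is a rough Python sketch of the idea: look for term set values in a document's text and record them per column. This is only an illustration using simple whole-phrase matching. It is not how Syntex matches terms internally (that is not exposed), and the function names, sample text and term lists are all hypothetical.

```python
import re

MAX_TAGGED_COLUMNS = 3  # per-library limit stated in the article


def match_terms(text: str, terms: list[str]) -> list[str]:
    """Return the terms that appear in text as whole phrases, ignoring case."""
    found = []
    for term in terms:
        # Lookarounds stop "Air" from matching inside "Airbus".
        pattern = r"(?<!\w)" + re.escape(term) + r"(?!\w)"
        if re.search(pattern, text, flags=re.IGNORECASE):
            found.append(term)
    return found


def tag_document(text: str, columns: dict[str, list[str]]) -> dict[str, list[str]]:
    """Map each column name to the terms from its term set found in text."""
    if len(columns) > MAX_TAGGED_COLUMNS:
        raise ValueError("At most three columns per library can use taxonomy tagging")
    return {column: match_terms(text, terms) for column, terms in columns.items()}


# Hypothetical document text and term sets.
sample_text = "Invoice from Coca-Cola for the Marketing department."
sample_columns = {
    "Supplier": ["Coca-Cola", "Pepsi"],
    "Department": ["Marketing", "Finance"],
}
tags = tag_document(sample_text, sample_columns)
print(tags)  # {'Supplier': ['Coca-Cola'], 'Department': ['Marketing']}
```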
This is currently only available if Syntex pay-as-you-go is configured in your tenant and linked to an Azure subscription. Billing is currently per processed document: each document counts as one transaction, priced at $0.05 per document processed. Taxonomy tagging is only available in sites where it has been enabled, and it only applies to new content added to the libraries.
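The per-document pricing makes it easy to estimate costs up front. Here is a small Python sketch, assuming the $0.05 per processed document price quoted above still applies (check current pricing before relying on it); the function name estimated_cost is my own.

```python
from decimal import Decimal

# USD price per processed document, as quoted in this article.
PRICE_PER_DOCUMENT = Decimal("0.05")


def estimated_cost(documents_processed: int) -> Decimal:
    """Estimate the taxonomy tagging charge: one transaction per document."""
    if documents_processed < 0:
        raise ValueError("documents_processed cannot be negative")
    return PRICE_PER_DOCUMENT * documents_processed


print(estimated_cost(12))      # 0.60
print(estimated_cost(10_000))  # 500.00
```

Decimal is used rather than float so that the currency amounts do not pick up rounding errors.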
Setup Microsoft Syntex Taxonomy Tagging
Below I will walk you through the process of enabling Syntex Taxonomy Tagger in your tenant and creating an Airlines & Aeroplane Manufacturer example using existing term sets and demo documents I have.
In the M365 Syntex admin centre, ensure your Azure subscription is set up and then click on Manage Microsoft Syntex.
Click on Taxonomy tagging.
Here are the current configuration options for selecting the SharePoint libraries where you would like to enable taxonomy tagging.
There are three options: libraries in all SharePoint sites, libraries in selected sites (search for a site or upload a CSV of site URLs), or no SharePoint libraries. I’m OK with taxonomy tagging being used on all sites, so I selected Libraries in all SharePoint sites.
NOTE: the wording is slightly confusing. It talks about enabling Microsoft Syntex taxonomy tagging on libraries but only allows you to select sites. There is presently no ability to select particular libraries, so if a site is selected then taxonomy tagging can be enabled on any library in that site, which might not be what you wanted.
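If you go with the selected sites option and have a long list of sites, the CSV can be generated rather than typed by hand. The Python sketch below assumes the upload expects one site URL per row with no header, which I have not verified, so check the format the admin centre asks for. The function name and the contoso URLs are placeholders.

```python
import csv
import io


def build_site_csv(site_urls: list[str]) -> str:
    """Return CSV text with one de-duplicated site URL per row.

    Assumption: one URL per row, no header row.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    seen = set()
    for url in site_urls:
        # Trim whitespace and trailing slashes so duplicates are detected.
        normalised = url.strip().rstrip("/")
        if normalised and normalised.lower() not in seen:
            seen.add(normalised.lower())
            writer.writerow([normalised])
    return buffer.getvalue()


# Placeholder site URLs for illustration only.
sites = [
    "https://contoso.sharepoint.com/sites/Aviation/",
    "https://contoso.sharepoint.com/sites/aviation",
    " https://contoso.sharepoint.com/sites/Finance ",
]
csv_text = build_site_csv(sites)
print(csv_text)
```

The returned text can be written to a .csv file and uploaded in the admin centre.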
That’s literally all the admin configuration that is needed, and all that is possible, at the current time!
Library configuration for Syntex Taxonomy Tagging
Now that Syntex taxonomy tagging is set up at the admin level, it needs to be configured on a library. I will go through the setup below.
First I will step back and show the Aeroplane Manufacturer and Airlines term sets in the SharePoint tenant admin site. I will then use these term sets as the source for the Aeroplane Manufacturer & Airline columns I am going to use with Taxonomy tagging.
Next I created a new library called Taxonomy Tagging and added a new column of type Managed Metadata. I linked it to the already created Airlines term set, ensured the Automatically tag documents with terms toggle was set to Yes, and then clicked Save.
I then did the same for an Aeroplane Manufacturer column linked to the Aeroplane Manufacturer term set, again ensuring the Automatically tag documents with terms toggle was set to Yes to activate Syntex taxonomy tagging.
My library is now configured with two Managed Metadata fields, Airline and Aeroplane Manufacturer. Now I will add some Aviation Incident Reports. I previously used these in an Aviation Incident Report document understanding model I uploaded to the PnP Syntex Samples site. You can download the model and the report files from the PnP Syntex Samples site.
I then added 12 files to the library. I know these files contain references to Airlines and Aeroplane manufacturers so I would expect them to be tagged. A sample of one of the files is below and we can see both Airline “American Airlines” & Aeroplane Manufacturer “Airbus” are mentioned.
The time estimated from document upload to the document getting tagged is currently specified as a minimum of 20 minutes and a maximum of 24 hours so we now have to wait!
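To know when it is worth checking back, the window can be worked out from the upload time. This is a small Python sketch using the 20 minute and 24 hour estimates quoted above; the function name and the upload timestamp are made up for the example.

```python
from datetime import datetime, timedelta

# Estimated delays between upload and tagging, as quoted in this article.
MIN_DELAY = timedelta(minutes=20)
MAX_DELAY = timedelta(hours=24)


def tagging_window(uploaded_at: datetime) -> tuple[datetime, datetime]:
    """Return the earliest and latest times tagging is estimated to complete."""
    return uploaded_at + MIN_DELAY, uploaded_at + MAX_DELAY


# Hypothetical upload time.
uploaded = datetime(2024, 1, 15, 9, 0)
earliest, latest = tagging_window(uploaded)
print(earliest)  # 2024-01-15 09:20:00
print(latest)    # 2024-01-16 09:00:00
```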
I then waited 24 hours and this was the final result (see image below). It looks like all the Airlines were correctly identified and tagged; however, only one document was tagged with an Aeroplane Manufacturer, “Airbus”.
However, all of the Aviation Incident Report documents contain Boeing, Airbus, McDonnell Douglas, etc., so I would have expected them to be tagged as well. As I said before, the beauty of Taxonomy tagger is that there is no complicated setup, but this does mean we cannot tweak the model to adjust the rules or see why manufacturer values were not extracted.
The two documents at the bottom do not contain references to Airline or Aeroplane manufacturer so are not tagged.
Taxonomy tagging is a great addition to the Syntex suite. Many organisations will already have extensive managed metadata term sets containing their organisational terms, e.g. clients, departments, locations, document types. These terms are likely used throughout their SharePoint environment and attached to columns so documents can be tagged. Previously this relied on users manually tagging documents, on default metadata, or on creating Syntex document processing models. Now the taxonomy tagging service can, with very minimal configuration, scan uploaded documents and tag them with matching terms found in the document. These documents can then be filtered or grouped in the library, or used with search to, for example, find all the documents in the tenant that are tagged with Coca-Cola.
Taxonomy tagging has a low configuration threshold for using AI with content, and there is no extensive setup or training of a model. So I can see this being widely used by organisations to have more of their content tagged in libraries. This will help in the Copilot/AI world, where having more content tagged means AI can be more authoritative.
Let me know how you get on with Taxonomy tagger and how you are using it in your organisation.