Custom BIM 360 search with Elasticsearch

To implement a fast search, we first need to index the data. It sounds simple but involves three main steps: find the data, send it to a search database, and keep it up to date. Forge allows us to find and collect metadata on CAD drawings and to get notifications for new data, so the missing piece is a search database.

Elasticsearch

There are many ways to run Elasticsearch, which is free, open-source software. One of the easiest is to use it as a service on AWS. You can also run it on EC2 or on your own infrastructure. Here we will use the AWS service.

But how does it work? With an AWS account, go to Elasticsearch Service and create a new instance with all default settings. After a few minutes the instance should be ready to use. Make sure to apply the proper access policy: this sample uses an access key and secret so you can connect from a local machine for testing.

Data to search

Forge allows us to access the metadata of CAD files, but what kind of data? Open a file, click on any element of interest and look at the property panel: that’s the data we can search on, and it should be useful for most use cases.

Using the Model Derivative API, an app can access the entire metadata of a file. With this API, the data should come as propertyName and propertyValue pairs. This is complete and descriptive, but it can result in a lot of duplicated text, as the property name repeats for every element in the file.

Now the question is: do you want to search for a property name or for the property value? For a generic search, we don’t need all of that, we just need the property value, right? Of course, you may argue that you do need to search on the property name, which is totally fine and possible.

To conclude this section: as we don’t need the duplicated text and just need the property values, we can use an undocumented way of accessing the Viewer property database. This was explored in another article, and here we’re using the objects_vals.json file.
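To make the idea concrete, here is a minimal Python sketch (the sample itself is written in C#). It assumes that objects_vals.json is a flat JSON array of property values mixing strings, numbers and nulls; because the file is undocumented, treat that layout as an assumption. The function name and the filtering rules are illustrative choices, not taken from the sample.

```python
import json

def searchable_values(objects_vals_json: str, min_length: int = 2) -> list[str]:
    """Return the distinct text values found in an objects_vals.json payload.

    Assumption: the payload is a flat JSON array of strings, numbers and nulls.
    """
    values = json.loads(objects_vals_json)
    seen = set()
    result = []
    for value in values:
        # Keep only text: numbers and nulls rarely help a free-text search.
        if not isinstance(value, str):
            continue
        text = value.strip()
        # Skip blanks, very short strings and repeats.
        if len(text) < min_length or text in seen:
            continue
        seen.add(text)
        result.append(text)
    return result

sample = '["Basic Wall", 3.5, null, "Level 1", "Basic Wall", " "]'
print(searchable_values(sample))  # ['Basic Wall', 'Level 1']
```

The resulting list is what would be sent to the search database as the searchable text of a file.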

Indexing

To index a large amount of data, we need to break down the task. There are several reasons for this, but the two most important are being able to retry just the part that fails and avoiding rate-limit errors.

Retry is important as several things can fail, from the data being inaccessible to connectivity problems. For this sample let’s break down the operation per folder (and per subfolder) and per file inside each folder. If any of those fails, we just retry that folder or file.
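The breakdown can be sketched in a few lines of Python. This is only an illustration of the idea, not the sample's C# code: list_folder and index_file are hypothetical callables standing in for the Data Management and Model Derivative calls, and max_attempts is an arbitrary choice.

```python
from collections import deque

def run_jobs(root_folder, list_folder, index_file, max_attempts=3):
    """Process one folder or one file per job; a failure re-queues only that job.

    list_folder(folder_id) -> (subfolder_ids, file_ids) and index_file(file_id)
    are hypothetical callables supplied by the caller.
    Returns the jobs that still failed after max_attempts.
    """
    queue = deque([("folder", root_folder, 1)])
    failed = []
    while queue:
        kind, target, attempt = queue.popleft()
        try:
            if kind == "folder":
                subfolders, files = list_folder(target)
                # Each subfolder and each file becomes its own small job.
                queue.extend(("folder", f, 1) for f in subfolders)
                queue.extend(("file", f, 1) for f in files)
            else:
                index_file(target)
        except Exception:
            if attempt < max_attempts:
                # Retry just this folder or file, not the whole crawl.
                queue.append((kind, target, attempt + 1))
            else:
                failed.append((kind, target))
    return failed
```

Because every folder and file is a separate unit of work, a transient error on one file costs one extra call instead of a full re-crawl.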

As we’re talking about long tasks we’re also talking about queuing, right? There are many ways to manage queues, including services from AWS or Azure. For this article’s sample, I’m using Hangfire, which is a .NET open-source library. Among other things, it allows us to manage how many jobs run in parallel and to retry in case of exceptions. Ultimately it helps us stay within rate limits.
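Hangfire takes care of this in the sample. To illustrate only the parallelism cap, here is a small Python sketch built on the standard library; run_limited and its parameters are hypothetical names, and jobs are assumed to be hashable values such as URN strings.

```python
from concurrent.futures import ThreadPoolExecutor

def run_limited(jobs, worker, max_parallel=2):
    """Run worker(job) for each job, never more than max_parallel at once.

    Returns (results, errors): two dicts keyed by job, so one failing job
    does not hide the outcome of the others.
    """
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        # Submitting everything up front is fine: the pool itself enforces the cap.
        futures = [(job, pool.submit(worker, job)) for job in jobs]
        for job, future in futures:
            try:
                results[job] = future.result()
            except Exception as error:
                errors[job] = error
    return results, errors
```

Lowering max_parallel is the simplest lever for reducing the number of simultaneous API calls when rate-limit errors start to appear.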

For this article and sample, let’s limit to the “Project Files” folder.

What about non-CAD files?

This indexing process could also cover non-CAD files, or files where objects_vals.json is not available, like PDFs. To handle those formats, we would need to read them and extract the text, which is not part of this article and sample.

Format the data

As we want to search on BIM 360 Document Manager files, we need to be able to link back to those files. When we open a file in the BIM 360 UI, we can see its URL:

https://docs.b360.autodesk.com/projects/{projectId}/folders/{folderUrn}/detail/viewer/items/{itemUrn}

When showing the search results, we also need the versionUrn to get the thumbnail and fileName to display.

Finally, as Elasticsearch stores documents, we need to define our documentId. Let’s use the itemUrn, which is the same for all versions of a file. As a result we only store the latest version of each file, meaning our search will only find hits on the latest version.
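Putting the pieces of this section together, a document could be shaped as in the Python sketch below. The URL pattern comes from the text above; the field names (projectId, values and so on) and both function names are assumptions for illustration, and a plain dictionary stands in for the Elasticsearch index.

```python
def viewer_url(project_id, folder_urn, item_urn):
    """Build the BIM 360 link back to a file, following the URL pattern above."""
    return (f"https://docs.b360.autodesk.com/projects/{project_id}"
            f"/folders/{folder_urn}/detail/viewer/items/{item_urn}")

def build_document(project_id, folder_urn, item_urn, version_urn, file_name, values):
    """Return (document_id, body) for one file.

    The id is the itemUrn, so indexing a newer version overwrites the older one.
    Field names are hypothetical.
    """
    body = {
        "projectId": project_id,
        "folderUrn": folder_urn,
        "itemUrn": item_urn,
        "versionUrn": version_urn,  # used later to fetch the thumbnail
        "fileName": file_name,      # shown in the search results
        "values": values,           # the searchable property values
    }
    return item_urn, body

# In-memory stand-in for the index: two versions of one file end up as one document.
index = {}
for version in (1, 2):
    doc_id, body = build_document("p1", "folderA", "item42",
                                  f"item42?version={version}", "tower.rvt", ["Basic Wall"])
    index[doc_id] = body
print(len(index), index["item42"]["versionUrn"])  # 1 item42?version=2
```

If searching older versions ever becomes a requirement, the versionUrn would be the natural documentId instead, at the cost of a larger index.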

Keeping the data up to date

The indexing process takes a lot of time and resources. We don’t need to index every file more than once, just the new files (or versions), right? So let’s use Webhooks to notify our app of any new file or version, get its metadata and update the respective document on Elasticsearch.

During the indexing process let’s add a webhook for dm.version.added at the “Project Files” level, which should capture new files in any subfolder.

Note that the webhook callback can happen any time and we will need a valid access token to access the file metadata.
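Since the callback can arrive at any moment, a reasonable pattern is to do as little as possible in the callback itself: validate it, turn it into a job and queue that job. The Python sketch below shows the first part. The body layout (hook.event, resourceUrn, payload.lineageUrn, payload.name) is an assumption about the callback format and should be checked against a real callback; job_from_callback is a hypothetical name.

```python
def job_from_callback(body: dict):
    """Turn a webhook callback body into a re-index job, or None to ignore it.

    Assumed layout (verify against a real callback):
      body["hook"]["event"]        -> event name
      body["resourceUrn"]          -> URN of the new version
      body["payload"]["lineageUrn"] -> URN of the item (same for all versions)
      body["payload"]["name"]      -> file name
    """
    if body.get("hook", {}).get("event") != "dm.version.added":
        return None
    payload = body.get("payload", {})
    item_urn = payload.get("lineageUrn")
    version_urn = body.get("resourceUrn")
    if not item_urn or not version_urn:
        return None  # incomplete callback: nothing we can index
    return {
        "itemUrn": item_urn,
        "versionUrn": version_urn,
        "fileName": payload.get("name"),
    }
```

The job is then processed like any other indexing job, which is the point where a valid access token is needed.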

Managing access & refresh tokens

The entire indexing operation and the later updates (via Webhooks) require a 3-legged token. We need to keep a database with the current access token and when it expires, plus the respective refresh token and when it expires. On every Forge call to Data Management, let’s look at the access token and use it if it is still valid; if not, use the refresh token to get a new one and store the new refresh token. For that we’ll use a MongoDB database where the documentId is the Autodesk UserId that started the indexing process.
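The token logic described above can be sketched as follows in Python. A dictionary keyed by user id stands in for the MongoDB collection, and refresh is a hypothetical callable that exchanges a refresh token for a new record; the 60-second safety margin is an arbitrary choice.

```python
from dataclasses import dataclass

@dataclass
class TokenRecord:
    access_token: str
    access_expires_at: float   # epoch seconds
    refresh_token: str
    refresh_expires_at: float  # epoch seconds

def get_access_token(store, user_id, refresh, now, margin=60.0):
    """Return a valid access token for user_id, refreshing it when needed.

    store: dict-like {user_id: TokenRecord}, a stand-in for the database.
    refresh(refresh_token, now) -> TokenRecord is a hypothetical callable.
    """
    record = store[user_id]
    # Still valid (with a safety margin): use it as is.
    if now < record.access_expires_at - margin:
        return record.access_token
    if now >= record.refresh_expires_at:
        raise PermissionError("refresh token expired; the user must sign in again")
    # Exchange the refresh token and persist the new pair,
    # including the new refresh token.
    new_record = refresh(record.refresh_token, now)
    store[user_id] = new_record
    return new_record.access_token
```

Storing the new refresh token on every exchange matters because the old one should not be relied on after it has been used.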

About the code

The .NET Core C# code for this sample is divided into some key files:

DataManagementCrawlerController: starting from the Account/Hub level, looks at every project and folder. For each file found, it queues up a Model Derivative job.
ModelDerivativeController: gets the file metadata and sends it to Elasticsearch.
OAuthController & OAuthDatabase: manage access & refresh tokens in the session and in the database.
Webhook & WebhookController: register the hooks and receive the callbacks.

Search is a big and deep topic, and this is just a first look at it. I hope you find this sample useful to get started.

Tags:

Code Samples

