Efficient binary serialization of IFC models using HDF5

T.F. Krijnen, J. Beetz

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; Academic; peer-review

Abstract

The Industry Foundation Classes (IFC) are a common file-based open standard for describing Building Information Models. An IFC file can describe a building model at a level of detail suitable for production use and unite information pertaining to all stakeholders involved in a construction project. IFC files can amount to gigabytes of data, and processing the full extent of this data can be time-consuming. Given the multi-disciplinary nature of our industry, doing so may also be unnecessary for the use case at hand. Therefore, retrieving relevant subsets, whether by spatial extent, by discipline, or by other criteria, is necessary to consume such datasets effectively in downstream applications.
However, the prevalent encodings of IFC models are text-based. The most prevalent encoding, called IFC-SPF, can be rather efficient in terms of file size, but by nature it does not facilitate random-access seeking in the file, and no ordering is imposed on the definition of elements in the file. Therefore, in the worst case, the entire file needs to be traversed in order to find the instances of interest. Furthermore, text-based data is slow to parse in comparison to its binary equivalent.
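The worst-case traversal described above can be illustrated with a minimal sketch. It is not the authors' code: the regular expression, the function name `find_instance` and the sample lines are assumptions made for illustration, and the sample is not schema-valid IFC. It also assumes one instance per line, which real IFC-SPF files do not guarantee.

```python
import re
from typing import Iterable, Optional, Tuple

# Matches a single-line instance such as "#12=IFCWALL(...);" (simplified).
INSTANCE_RE = re.compile(r"^#(\d+)\s*=\s*([A-Za-z0-9_]+)\s*\((.*)\);\s*$")

def find_instance(lines: Iterable[str], wanted_id: int) -> Optional[Tuple[str, str]]:
    """Return (type_name, raw_arguments) for instance #wanted_id, or None.

    Instances may appear in any order, so every line is inspected until
    a match is found: linear in the number of lines.
    """
    for line in lines:
        match = INSTANCE_RE.match(line.strip())
        if match and int(match.group(1)) == wanted_id:
            return match.group(2), match.group(3)
    return None

# Hand-written fragment in IFC-SPF style; ids are deliberately unordered.
SAMPLE = """\
#3=IFCWALL('wall-guid',$,'Wall A');
#1=IFCPROJECT('proj-guid',$,'Demo');
#2=IFCBUILDING('bldg-guid',$,'Block 1');
"""

result = find_instance(SAMPLE.splitlines(), 2)
print(result)
```

Looking up an id that is absent forces the loop to read every line, which is the linear-time behaviour the abstract contrasts with HDF5.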
This paper introduces a binary serialization for IFC models as an alternative to the prevalent text-based formats. It is based on an existing open standard called HDF5. An implementation for translating conventional IFC instance models into HDF5 is provided under an open source license. HDF5 is a binary, hierarchical data format, and its hierarchical nature allows random access to specific instances. Other benefits include transparent compression and mechanisms for linking and mounting external files. The compressed HDF5 format yields a significant reduction in file size compared to IFC-SPF models. Three use cases show that extracting data from the model can occur in near-constant time in relation to the size of the model, in contrast to linear time with IFC-SPF models.
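The difference between random access and a linear scan can be sketched with the Python standard library alone. The following is a stand-in, not HDF5 and not the authors' implementation: it stores hypothetical fixed-width binary records in memory and reads one row by seeking straight to its byte offset, which is the access pattern a binary dataset with a known row size makes possible. The record layout (`id, x, y, z`) and the function names are assumptions.

```python
import io
import struct

# Hypothetical fixed-width record: instance id plus x, y, z (little-endian).
RECORD = struct.Struct("<Iddd")

def write_points(buf, points):
    """Append each (id, x, y, z) tuple as one fixed-width binary record."""
    for point in points:
        buf.write(RECORD.pack(*point))

def read_point(buf, index):
    """Read the record at row `index` by seeking directly to its offset."""
    buf.seek(index * RECORD.size)
    return RECORD.unpack(buf.read(RECORD.size))

buf = io.BytesIO()
write_points(buf, [(10, 0.0, 0.0, 0.0), (11, 1.5, 2.0, 0.0), (12, 3.0, 2.0, 2.7)])

# The cost of this read does not depend on how many records precede it.
third = read_point(buf, 2)
print(third)
```

No text parsing is involved either: the bytes are unpacked directly into numbers, which relates to the abstract's remark that text-based data is slower to parse than its binary equivalent.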
The translation into HDF5 files follows an existing ISO-standardized mapping for instance models of EXPRESS, the parent standard of IFC. The self-documenting nature of HDF5 makes it possible to incorporate additional attributes that are not part of the schema. To improve visualisation, one can cache calculated information, such as triangulated geometry for complex CSG geometries that are expensive to compute. In addition, incorporating inverse attribute values as part of the instantiation allows further optimization of the generation of subgraphs.
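Why stored inverse attribute values help with subgraph generation can be shown with a small sketch. It is an illustration under stated assumptions, not the paper's data layout: the miniature instance graph, the ids and the helper names `build_inverse_index` and `referencing` are hypothetical, and only the entity type names are taken from IFC.

```python
from collections import defaultdict

# Hypothetical miniature instance graph: id -> (type name, referenced ids).
INSTANCES = {
    1: ("IfcBuildingStorey", []),
    2: ("IfcWall", []),
    3: ("IfcDoor", []),
    4: ("IfcRelContainedInSpatialStructure", [1, 2, 3]),
}

def build_inverse_index(instances):
    """Map each instance id to the ids of the instances that reference it."""
    inverse = defaultdict(list)
    for inst_id, (_type_name, refs) in instances.items():
        for target in refs:
            inverse[target].append(inst_id)
    return dict(inverse)

# Computed once, analogous to writing inverse values when the file is created.
INVERSE = build_inverse_index(INSTANCES)

def referencing(inst_id):
    """Return the referrers of inst_id without scanning every instance."""
    return INVERSE.get(inst_id, [])

referrers_of_wall = referencing(2)
print(referrers_of_wall)
```

Without the precomputed index, answering "which relationship contains this wall?" would mean visiting every instance and checking its forward references.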
Original language: English
Title of host publication: ICCCBE2016: 16th International Conference on Computing in Civil and Building Engineering, Osaka, July 6-8, 2016
Number of pages: 8
Publication status: Published - 2016
Event: 16th International Conference on Computing in Civil and Building Engineering, Osaka, July 6-8, 2016 - Osaka, Japan
Duration: 6 Jul 2016 to 8 Jul 2016

Conference

Conference: 16th International Conference on Computing in Civil and Building Engineering, Osaka, July 6-8, 2016
Abbreviated title: ICCCBE2016
Country: Japan
City: Osaka
Period: 6/07/16 to 8/07/16

Keywords

  • BIM
  • IFC
  • EXPRESS
  • HDF5
  • binary storage
  • file size
  • performance

Cite this

Krijnen, T. F., & Beetz, J. (2016). Efficient binary serialization of IFC models using HDF5. In ICCCBE2016: 16th International Conference on Computing in Civil and Building Engineering, Osaka, July 6-8, 2016