TY - JOUR
T1 - An efficient binary storage format for IFC building models using HDF5 hierarchical data format
AU - Krijnen, Thomas
AU - Beetz, Jakob
PY - 2020/5/1
Y1 - 2020/5/1
AB - The Industry Foundation Classes (IFC) are a prevalent data model for exchanging Building Information Models, typically as files. Processing such models in full can be time-consuming. Given the multi-disciplinary nature of the construction industry, stakeholders are typically interested in only a small subset of a model, depending on the purpose of the exchange. Retrieving relevant subsets, whether selected spatially, by discipline, or by other criteria, is therefore necessary to consume such datasets effectively in downstream applications. Prevalent encodings of IFC models are text-based; they neither facilitate random-access seeking within the file nor impose an ordering on the definition of elements within it. As a result, the entire file typically needs to be read to find the data of interest. Furthermore, text-based data is slower to parse than binary data. This paper assesses a binary serialization format originating from the family of EXPRESS standards. It is based on HDF5, an existing open, binary, hierarchical data format that allows random access to specific instances and therefore efficient retrieval of relevant subsets. Its block-level, transparent compression reduces file sizes compared to traditional serializations. Fully specified datatypes embedded in the exchange guarantee interoperable use. Several serialization profiles are introduced that cater to specific use cases by governing storage settings. Advanced functionality of the HDF5 library is applied to offer novel paradigms for fine-grained access rights, varying levels of detail, revision management, and aggregation of aspect models.
KW - BIM
KW - IFC
KW - EXPRESS
KW - HDF5
KW - Binary storage
KW - File size
KW - Performance
KW - Level of detail
KW - Federation
UR - http://www.scopus.com/inward/record.url?scp=85081063435&partnerID=8YFLogxK
U2 - 10.1016/j.autcon.2020.103134
DO - 10.1016/j.autcon.2020.103134
M3 - Article
VL - 113
JO - Automation in Construction
JF - Automation in Construction
SN - 0926-5805
M1 - 103134
ER -