  • Faculty of Civil, Geo and Environmental Engineering

    Chair of Computational Modeling and Simulation

    Prof. Dr.-Ing. André Borrmann

    Faculty of Architecture

    Chair of Architectural Informatics

    Prof. Dr.-Ing. Frank Petzold

    VCS 4 CDE – Version Control Systems as

    Common Data Environments

    July 27, 2018

    Report

    Advanced Topics in Building Information Modeling

    Michael Jäger

  • Advanced Topics in Building Information Modeling | VCS 4 CDE – Version Control Systems as Common Data Environments

    Contents

    Introduction

    1 Requirements of the Construction Industry
      1.1 Data storage systems
        1.1.1 Flat file systems
        1.1.2 Document Management Systems
        1.1.3 Building Information Models
        1.1.4 Common Data Environments
      1.2 Requirements
        1.2.1 Distribution and performance
        1.2.2 Data safety and security
        1.2.3 Versioning and collaboration
        1.2.4 Concurrency
        1.2.5 Aggregation

    2 Basics of Version Control Systems
      2.1 Functionality
        2.1.1 Working directories and commits
        2.1.2 Branching
        2.1.3 Distributed VCS
      2.2 History and Popular Systems

      2.3 Supplemental Software

    3 Potential use of VCS in construction
      3.1 Example Project
        3.1.1 Overview
        3.1.2 Planning Stage
        3.1.3 Execution Stage
      3.2 Tracking Changes with Diff
        3.2.1 Experimental Version Comparison
        3.2.2 Insights

    4 Summary
      4.1 Limitations
      4.2 Outlook

    List of Figures

    List of Tables

    References

    Introduction

    Common data environments (CDE) and document management systems (DMS) serve as essential digital infrastructure for contemporary collaborative construction projects, especially those utilizing building information modeling (BIM). There are countless tailor-made solutions, some independent, some integrated with enterprise content management or groupware environments such as Microsoft Exchange or IBM Domino.

    One might compare this state to software development. Like the AEC (architecture, engineering and construction) industry, that sector is shaped by numerous contributors working on the same projects and by the need for accountability and traceability. Unlike the AEC industry, the software sector has long relied on dedicated version control systems (VCS) to provide just that. Those systems can record document edits, highlight changes and ensure, within limitations, identical data sets for each collaborator. Many of them, most famously git, are open source and work on top of regular file systems, allowing compatibility with and integration into regular development software.

    This report investigates a) the construction industry’s requirements concerning DMS and CDE for BIM projects and b) the capabilities of common VCSs. It examines the possible use of the latter in the construction industry to distribute not only documents and plans but also building information models.

    Chapter 1

    Requirements of the Construction Industry

    With the professionalization of the construction industry, its projects have taken on enormous complexity. Contracts and specifications are measured in shelf-meters, while plan binders fill entire warehouses for single projects. Even more difficult than efficiently accessing those large amounts of information is distributing them among planners, owners, authorities, contractors and numerous other stakeholders. Missing or outdated documents and plans are a prevalent, if not the most prevalent, cause of delays, errors and cost increases.

    A distinction should be made between several types of data, which are henceforth all referred to as documents:

    contracts, agreements, regulations are not (or rarely) changed once finalized, usually before work starts. Any amendments would be separate documents. Frequent changes to non-finalized documents in early project stages are to be expected.

    billing and accounting data is generated throughout the project. It is frequently updated, commented on or superseded, all the while being critical for legal and financial reasons.

    plans are the core product of architects and engineers. While ideally any plan is complete once published and approved, in (Central European) reality they are frequently updated and corrected. While an insufficient supply of billing data inhibits cash flow and creates legal issues, inadequate distribution of plans leads to costly mistakes and delays in actual construction.

    building information models differ greatly from plans, although they may seem superficially similar. Data harmonization and exchange are baked into the BIM approach, eliminating many problems and error causes of traditional planning. Technical difficulties in implementation remain but are actively being worked on by well-funded software companies in a highly competitive and rapidly evolving market.

    1.1 Data storage systems

    Computer-aided data storage systems have proven indispensable for managing, storing and distributing project data of any kind. Over the years many such systems have been developed, with varying degrees of complexity and specificity.

    1.1.1 Flat file systems

    The simplest form are flat file systems. Emulating their physical namesake, they provide a simple hierarchical structure. Any document’s context is derived from its place within the hierarchy and its name, which in itself is often codified in project- or organization-wide regulations. Adherence to hierarchy and naming conventions is usually enforced manually. Access control – that is, limiting access to certain documents to certain individuals or groups – is commonly available but regularly lacking in usability.
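
    Manual enforcement of a naming convention can in principle be supported by an automated check. The following is a minimal sketch, assuming a purely hypothetical convention of the form PROJECT-DISCIPLINE-TYPE-NUMBER_vNN.ext; the pattern, the field names and the function `parse_document_name` are illustrative and not taken from any real project standard.

```python
import re
from typing import Dict, Optional

# Hypothetical convention: <project>-<discipline>-<type>-<number>_v<version>.<ext>
# Example: "P042-ARC-PLAN-0012_v03.pdf". Real projects define their own scheme.
NAME_PATTERN = re.compile(
    r"(?P<project>[A-Z0-9]{3,6})-"
    r"(?P<discipline>ARC|STR|HVAC)-"
    r"(?P<doctype>PLAN|SPEC|CALC)-"
    r"(?P<number>\d{4})_v(?P<version>\d{2})"
    r"\.(?P<ext>pdf|dwg|ifc)"
)


def parse_document_name(file_name: str) -> Optional[Dict[str, str]]:
    """Return the context encoded in a file name, or None if it breaks the convention."""
    match = NAME_PATTERN.fullmatch(file_name)
    return match.groupdict() if match else None
```

    A name such as "P042-ARC-PLAN-0012_v03.pdf" yields its project, discipline, document type, number and version, while a free-form name such as "floorplan final.pdf" is rejected.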

    Usually every stakeholder maintains their own document storage, yet documents need to be made available to all relevant stakeholders. Traditionally, this has been achieved through postal mail or fax. Fax, however, is widely considered outdated, while sending documents by post or courier takes time that could otherwise be used more productively. E-mail, on the other hand, is fast but generally not legally binding. It is commonly used to notify recipients of incoming mail ahead of time.

    E-mail also places strict limits on file sizes. The simplest solution here is an internet-accessible file server using, for example, the (Secure) File Transfer Protocol, hosted by a project party. That server may in turn be embedded into one or more stakeholders’ storage systems, unless they are reluctant to rely on such a server as a replacement for their own storage on grounds of limited availability and performance, which they often and rightfully are.

    In practice, cloud-based, consumer-oriented services such as WeTransfer or Dropbox are used for one-off data transfers if no stakeholder can or will provide such a server.

    1.1.2 Document Management Systems

    The flaws of flat file systems inspired the development of dedicated document management systems (DMS) as we know them today. A DMS in its prevailing definition is built around a database that stores metadata along with the documents themselves. This metadata includes dates, authors, states (work-in-progress, approved, archived, ...) and the contexts of documents.

    The DMS creates indices of a document’s content, either manually or through content recognition, to facilitate efficient data retrieval. It also allows users to lock and unlock files, marking them as being worked on and preventing users from overwriting each other’s changes.
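
    The combination of stored metadata and a content index can be sketched as follows. This is a minimal illustration that assumes an in-memory dictionary in place of a real database and a naive word split in place of real content recognition; the names `DocumentRecord` and `TinyDMS` and all field names are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class DocumentRecord:
    """Metadata stored next to a document (field names are illustrative)."""
    name: str
    author: str
    date: str   # ISO date string, e.g. "2018-07-27"
    state: str  # e.g. "work-in-progress", "approved", "archived"
    text: str   # extracted content used for indexing


class TinyDMS:
    """In-memory stand-in for the database behind a DMS."""

    def __init__(self) -> None:
        self.records: Dict[str, DocumentRecord] = {}
        # Inverted index: word -> names of the documents containing it
        self.index: Dict[str, Set[str]] = defaultdict(set)

    def add(self, record: DocumentRecord) -> None:
        """Store the record and index every word of its content."""
        self.records[record.name] = record
        for word in record.text.lower().split():
            self.index[word].add(record.name)

    def search(self, word: str) -> List[str]:
        """Names of all documents whose content contains the word."""
        return sorted(self.index.get(word.lower(), set()))

    def by_state(self, state: str) -> List[str]:
        """Names of all documents currently in the given state."""
        return sorted(n for n, r in self.records.items() if r.state == state)
```

    Retrieval then works through the index or the metadata rather than through a document's position in a folder hierarchy.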

    1.1.3 Building Information Models

    Building information modeling aims to unify all relevant information about a building, starting with geometry but extending to cost and construction schedules, materials and quality requirements, static models, physical simulations and environmental calculations.

    Figure 1.1: Building Information Model

    The concept of BIM introduces a whole new set of technical challenges to IT environments in the AEC sector while sharing all requirements associated with conventional planning methods. Every project’s planning data consists of informational resources (see footnote 1). Each resource is stored, accessed, locked and unlocked, edited, approved, etc. separately and has its own metadata, e.g. last-edited-at or approved-by. The end result of a project utilizing BIM is a single set of planning data.

    Early BIM projects strove for a single 3D model enriched with metadata for elements, building parts and the entire project. The difficulties associated with allowing several project participants access to that model led to a less centralized approach of multiple discipline-specific models that are federated in a coordination model according to predefined model views.

    Different extents and forms of BIM utilization are appropriate for different projects, since the necessary technology is not yet fully mature in all sectors. At the same time, not all projects utilize BIM to its full potential yet. Many planners, owners and contractors are still sceptical of its advantages (rightfully citing inappropriate payment regulations), are unwilling to make the necessary adaptations to their business practices, or lack the expertise and the personnel and financial capacity to do so.

    Footnote 1: The terms resource and document describe very similar concepts and are used interchangeably in this context.

    1.1.4 Common Data Environments

    A more formalized concept that adapts the idea of a DMS into a so-called Common Data Environment (CDE) is provided in the British PAS 1192 [18] and the recently released ISO 19650-1 [11], with very little difference between the two. These standards describe processes and structures for information exchange in a BIM environment. They do not demand a specific implementation but rather explain how and for what such an environment shall be used. Many existing software solutions are capable of serving as a CDE; software for that express purpose is in development.

    A CDE is a single place where all project participants store and exchange project-related information. Inside this area, each team can work on their own resources (so-called containers) before submitting them for approval and cross-checking them with other stakeholders’ resources. Notably, ISO 19650 extends the idea of a building information model to a project information model, including non-graphical data and documentation alongside graphical data, that is, federated models with three or more dimensions.

    There are four formalized states of a document: work-in-progress, shared, published and archived, with well-defined processes and authorizations to transfer documents between those states. The work-in-progress state explicitly includes resources that are not ready to be shared with other project participants. In the shared state they are available to the project team but not yet final, as they require harmonizing with other participants’ data. Once that is achieved and a resource is verified, it reaches the published state, where it can be used for tender and construction. The early states in particular are to be understood as iterative, with work states being updated frequently [16].
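
    The four states and the transitions between them can be read as a small state machine. The sketch below is one possible reading, not a rendering of the standard: it assumes that a shared resource may return to work-in-progress (to reflect the iterative early stages) and leaves out the authorizations attached to each transition. The names `State`, `ALLOWED` and `transition` are hypothetical.

```python
from enum import Enum


class State(Enum):
    WORK_IN_PROGRESS = "work-in-progress"
    SHARED = "shared"
    PUBLISHED = "published"
    ARCHIVED = "archived"


# Assumed transition table; the step SHARED -> WORK_IN_PROGRESS is an
# assumption that models the iterative nature of the early states.
ALLOWED = {
    State.WORK_IN_PROGRESS: {State.SHARED},
    State.SHARED: {State.WORK_IN_PROGRESS, State.PUBLISHED},
    State.PUBLISHED: {State.ARCHIVED},
    State.ARCHIVED: set(),
}


def transition(current: State, target: State) -> State:
    """Return the new state, or raise ValueError if the step is not allowed."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target
```

    A resource therefore cannot jump from work-in-progress straight to published; it has to pass through the shared state first.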

    1.2 Requirements

    The following subsections investigate feature requirements for data infrastructure in an AEC environment. Some of these concepts are also explored in Annex A of ISO 19650-1, specifically simultaneous working, information security and information transmission.

    Once all those challenges are met, the largest one remains: getting stakeholders to accept, trust and actually use the provided system. This can be achieved through contractual obligation, whose implementation lies within the project owner’s responsibility. The harder but more sustainable way requires time, experience and positive examples.

    1.2.1 Distribution and performance

    A fundamental problem of simple file-based document management is that every stakeholder maintains their own document database in whichever form they see fit. This leads to inconsistencies whenever a change does not reach an affected party in time or at all. Instead, it is important that all participants access the same versions of the same files (the exception being, of course, current work in progress).

    Just as important as synchronisation is uninterrupted availability to all stakeholders at any time. Any downtime limits or prohibits productivity, depending on the level of integration. The conventional flat file system approach places responsibility for availability with each party’s own IT, which is convenient from a legal perspective but may not be a good thing for smaller parties without a dedicated IT department.

    There are two main approaches to guaranteeing availability: one or more central servers, or automatic distribution among mirrored repositories.

    Any centralized system is constrained by a required performance level. The server needs to be able to supply the expected number of clients simultaneously without causing unacceptable delays. Working on a remote server may not be an option at all for parties without fast internet access.

    A decentralized system maintains several independent copies of the data set. Changes

    are regularly copied between the instances. Such systems require more hardware and more

    complex software for managing changes of the different instances.

    1.2.2 Data safety and security

    For legal, contractual and practical reasons, construction plans and other documents need to be archived for a long time, sometimes as long as the building exists. Throughout that time, no irreversible damage must come to the data and it must remain legible. Nowadays more and more clients demand a building information model that can be transferred into an asset information model representing the built product to simplify facility management.

    A paper archive is often viewed as safer than a server, yet both are equally susceptible to force majeure or physical sabotage. Depending on the physical storage medium, however, a power failure might actually compromise data safety. A competent IT department will prevent data loss through regular off-site backups and long-term archiving on tape drives.

    File formats, software standards and storage media evolve over time. Care needs to be taken that digital data, if it is to be used in place of printouts, is stored in usable file formats on storage media that remain accessible even after decades of technological development. Storing PDF/A files along with native file formats is a promising approach, while the cloud has the potential to eliminate the need for maintenance of outdated storage hardware, as data storage becomes an abstract service.

    Security is at least as important as safety. Within the limits set by contractual obligations and practical necessity, most data in a construction project constitutes a company secret. Contract data is usually only available to the contracting parties. This extends to billing information, as knowledge of a competitor’s prices offers a great advantage to any company in subsequent tenders. Security-related information may also be classified, e.g. for prisons or military installations. Even most file systems support at least basic access control. In addition, encryption may be used for especially delicate documents.

    1.2.3 Versioning and collaboration

    Computer systems must aid communication, coordination and collaboration, either directly from partner to partner or indirectly through editing shared resources.

    Any building project is subject to evolution; each piece of information may be created,

    approved, adapted or invalidated by several contributors. All of these changes need to be

    documented, attributed to their author and possibly reverted.

    Conventionally, this is achieved in two distinct ways. On plans, a version number is printed along with that version’s author, its creation date and notable changes. Drafts of textual documents usually have each version saved as a separate document, often with the creation date and various other prefixes and suffixes as part of the file name. Both situations lead to many versions of the same document appearing side by side. It is easy to confuse versions, leading to duplicate work or overlooked changes.

    Many modern document management systems have integrated version control. They allow users to freely browse previous versions and the previously mentioned accompanying data and even to reset a document to an earlier state. Some users are still reluctant to rely on these versioning systems, be it for lack of experience, uncertainty regarding their reliability or the fact that they are generally not legally binding.

    1.2.4 Concurrency

    Whenever a resource is shared, multiple users might want to edit the same resource at the

    same time. The resulting inconsistencies can be avoided with either pessimistic or optimistic

    concurrency.

    A DMS with pessimistic concurrency requires a user to check out (or lock) a document before gaining write access and to check it in (or unlock it) once they are done. In some systems this happens automatically. No other user may edit a locked document.
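
    The check-out and check-in cycle can be illustrated with a minimal lock registry. This is a sketch under simplifying assumptions (a single process, no persistence, no lock timeouts); `LockRegistry` and `LockError` are hypothetical names and not the interface of any particular DMS.

```python
from typing import Dict, Optional


class LockError(Exception):
    """Raised when a document cannot be checked out or checked in."""


class LockRegistry:
    def __init__(self) -> None:
        self._locks: Dict[str, str] = {}  # document name -> user holding the lock

    def check_out(self, document: str, user: str) -> None:
        """Grant write access, unless another user already holds the lock."""
        holder = self._locks.get(document)
        if holder is not None and holder != user:
            raise LockError(f"{document} is locked by {holder}")
        self._locks[document] = user

    def check_in(self, document: str, user: str) -> None:
        """Release the lock; only the user holding it may do so."""
        if self._locks.get(document) != user:
            raise LockError(f"{user} does not hold the lock on {document}")
        del self._locks[document]

    def holder(self, document: str) -> Optional[str]:
        """Current lock holder, or None if the document is unlocked."""
        return self._locks.get(document)
```

    While one user holds the lock, every other check-out attempt fails, which rules out conflicting edits at the price of making collaborators wait.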

    In an optimistic collaboration system users are permitted to edit a resource simultaneously. Such a system will assume that most changes won’t conflict with each other and can therefore be merged into a single updated version automatically. Conflicts that do occur need to be manually resolved. This approach works well for textual data, since changes are limited to specific lines of text. Whether this extends to the STEP-based IFC format as well will be examined in a later chapter.
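
    Why line-based text merges well can be shown with a deliberately simplified three-way merge. The sketch assumes that both users only changed lines in place, so that the common base version and both edited versions have the same number of lines; real merge tools also handle inserted and deleted lines. The function name `merge_lines` is hypothetical.

```python
from typing import List, Tuple


def merge_lines(
    base: List[str], ours: List[str], theirs: List[str]
) -> Tuple[List[str], List[int]]:
    """Merge two edited versions of `base` line by line.

    Returns the merged lines and the indices of conflicting lines.
    Conflicting lines keep the base content so a person can resolve them.
    """
    if not (len(base) == len(ours) == len(theirs)):
        raise ValueError("this sketch only handles in-place line edits")
    merged: List[str] = []
    conflicts: List[int] = []
    for i, (b, o, t) in enumerate(zip(base, ours, theirs)):
        if o == t:        # both agree (changed identically or not at all)
            merged.append(o)
        elif o == b:      # only the other side changed this line
            merged.append(t)
        elif t == b:      # only our side changed this line
            merged.append(o)
        else:             # both changed the line differently
            merged.append(b)
            conflicts.append(i)
    return merged, conflicts
```

    Edits to different lines merge automatically; only when both users change the same line differently is the line reported for manual resolution.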

    1.2.5 Aggregation

    An important influence on a project’s specific challenges is its level of aggregation, or what is to be considered an individual resource.

    The introductory description in section 1.1.3 fits a concept more precisely called BIG BIM. Its counterpart little bim utilizes specific software for specific isolated tasks within a project (fig. 1.2). The resulting models are self-contained resources comparable to plans and documents in traditional planning. Depending on project complexity, the level of aggregation in BIG BIM is either much higher or much lower.

    With a high level of aggregation an entire building may be represented in a single model or a few models, each representing a partial model. The model may be subdivided by zone (floor, building section), by domain, i.e. functional aspects such as an architectural, structural or HVAC model, or even not at all. Partial models are coordinated and cross-checked (i.e. federated) with each other using model checkers that scan for collisions and conflicts, and are assembled into coordination models according to predefined model views.

    The lowest level of aggregation on the other hand treats single building elements, pieces of information or even paragraphs in textual documents as objects. Managing so many objects requires a specialised BIM server or product model server. A BIM server is usually integrated seamlessly into its respective developer’s BIM software as part of a closed BIM solution. A product model server on the other hand uses open standards to provide resources to all kinds of clients, be it modelling software or a browser-based web interface – allowing for an Open BIM approach. Both store their resources in a database and handle resource management internally.

    Figure 1.2: Open vs. Closed and little vs. BIG describe different approaches to BIM

    The higher the level of aggregation, the fewer resources need managing. On the other hand, this makes concurrent work more difficult and possibly less efficient, especially with pessimistic forms of collaboration. The lower the level of aggregation, the more fine-grained locking or approval operations will be.

    Chapter 2

    Basics of Version Control Systems

    A Version Control System is a software system that

    - allows multiple users to work on the same files simultaneously,

    - prevents or harmonizes conflicts in file changes arising from that,

    - stores every version of every file along with its editor, timestamp, etc.

    The files a VCS works with are stored in a special location called a repository. Unlike a two-dimensional file system (name and directory), a repository identifies files by name, subdirectory and time stamp.
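
    The additional time dimension can be illustrated with a small lookup structure. The sketch assumes integer time stamps and keeps whole file contents per version, which real systems avoid for efficiency; `TinyRepository` and its methods are hypothetical names.

```python
import bisect
from collections import defaultdict
from typing import Dict, List, Tuple


class TinyRepository:
    """Maps (path, time stamp) to file content instead of just path to content."""

    def __init__(self) -> None:
        # path (subdirectory and name) -> list of (time stamp, content), kept sorted
        self._history: Dict[str, List[Tuple[int, str]]] = defaultdict(list)

    def store(self, path: str, timestamp: int, content: str) -> None:
        """Record a new version of the file at the given time."""
        bisect.insort(self._history[path], (timestamp, content))

    def read(self, path: str, timestamp: int) -> str:
        """Return the version of the file that was current at `timestamp`."""
        versions = self._history.get(path, [])
        times = [t for t, _ in versions]
        position = bisect.bisect_right(times, timestamp)
        if position == 0:
            raise KeyError(f"{path} did not exist at time {timestamp}")
        return versions[position - 1][1]
```

    Reading the same path at two different times can thus return two different contents, which a plain file system cannot express.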

    2.1 Functionality

    There is a set of commands that are shared by virtually all VCSs [19]. This section briefly explains the basic features. Note that not all systems use the same syntax, so the most common terms are used here.

    2.1.1 Working directories and commits

    Files are not edited directly in the repository. Instead, a working copy is created with the checkout command. The changes a user makes are transferred to the repository with the commit command. Since the repository may have been changed by other users in the meantime, the working copy can be updated to retrieve those changes. A single commit is the atomic unit of change in a data set that is archived in the repository. The record of all commits is called the commit history.

    Before committing changes, a user might wish to review a summary of those changes. This is done with the diff command, which compares versions of a file with each other line by line.
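
    A line-by-line comparison of this kind can be reproduced with Python's standard `difflib` module. The sketch below is only an illustration of the principle; the wrapper `summarize_changes` and the file labels are hypothetical, and the output format of an actual VCS diff command may differ in detail.

```python
import difflib
from typing import List


def summarize_changes(old_text: str, new_text: str, name: str = "file") -> List[str]:
    """Return a unified diff between the repository version and the working copy."""
    return list(
        difflib.unified_diff(
            old_text.splitlines(),
            new_text.splitlines(),
            fromfile=f"{name} (repository)",
            tofile=f"{name} (working copy)",
            lineterm="",  # input lines carry no newline, so header lines should not either
        )
    )
```

    In the result, lines prefixed with "-" were removed or replaced and lines prefixed with "+" were added; identical inputs yield an empty list.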

    2.1.2 Branching

    Early VCSs had a linear commit history: each version builds upon the previous commit and there is always a current commit. However, this doesn’t represent the actual workflows of software developers, nor those of engineers and architects. Instead, users work on different versions of a project and its repository in parallel (e.g. after version 3.0 of a software product is released, some developers work on version 3.1 and others on 4.0). This is modeled by branching.

    The branch command creates a copy (i.e. a branch) of the repository. Each branch can be committed to separately. The merge command then attempts to combine the changes in both branches into one version. This may or may not work automatically – as changes often conflict with each other – and remains one of the largest challenges VCS users and developers face. Before merging, the differences can be reviewed with diff.

    Branch and commit history are often visualized in Directed Acyclic Graphs or DAGs

    (fig. 2.1). Each node represents a commit, each linear sequence of nodes a branch.
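
    The graph structure can be made concrete with a small sketch. It assumes, purely for illustration, that the commit history is given as a dictionary mapping each commit to its parent commits; the commit names and the helper functions `ancestors` and `merge_bases` are hypothetical and do not mirror the internals of any particular VCS.

```python
from typing import Dict, List, Set

# Assumed example history: a main line c1 -> c2 -> c3, a feature branch
# f1 -> f2 that starts at c2, and a merge commit m with two parents.
PARENTS: Dict[str, List[str]] = {
    "c1": [],
    "c2": ["c1"],
    "c3": ["c2"],
    "f1": ["c2"],
    "f2": ["f1"],
    "m": ["c3", "f2"],
}


def ancestors(parents: Dict[str, List[str]], commit: str) -> Set[str]:
    """All commits reachable from `commit` via parent links, including itself."""
    seen: Set[str] = set()
    stack = [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def merge_bases(parents: Dict[str, List[str]], a: str, b: str) -> Set[str]:
    """Latest commits shared by the histories of `a` and `b`."""
    common = ancestors(parents, a) & ancestors(parents, b)
    # Drop every shared commit that is itself an ancestor of another shared commit.
    return {
        c for c in common
        if not any(c != d and c in ancestors(parents, d) for d in common)
    }
```

    In the assumed history the two branch tips c3 and f2 share the commit c2 as their latest common state, which is the natural starting point for comparing or merging the branches.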

    While branching more accurately represents a developer’s workflow, it also complicates project management significantly because there no longer is a definitive latest version. This is why, by convention, most projects have (in simplified terms) a master branch and one or more development or feature branches. The master branch is only changed when a development branch is merged into it, and only once that branch is sufficiently stable. A release version may be yet another separate branch from the master.

    Figure 2.1: Example of a Directed Acyclic Graph

    2.1.3 Distributed VCS

    The last fifteen years have seen the emergence of Distributed Version Control Systems (DVCS). Those systems have multiple repositories. There may still be a main repository, but in essence each repository works independently of the others.

    The clone command creates a local instance of the same repository, identical to the

    original.

    The push command attempts to copy the changes in a local repository into a remote one. It only succeeds if the remote repository contains no commits that the local one doesn’t.

    The pull command synchronizes a local with a remote repository by merging a copy of the remote one with the local one. It is used whenever the remote repository has received commits after the local one has been cloned (for example by receiving a merge from another branch). The pull command therefore often needs to be called before a push, lest the push fail. For both push and pull a user must specify which branches on both instances they wish to push or pull from and to.
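
    The rule that decides whether a push succeeds can be sketched with plain sets of commit identifiers. This is a strong simplification: branches are ignored, and pull simply takes over the missing commits, whereas a real pull may have to create a merge commit or resolve conflicts. The class `Repo` and the commit names are hypothetical.

```python
from typing import Iterable, Set


class Repo:
    """A repository reduced to the set of commits it contains."""

    def __init__(self, commits: Iterable[str] = ()) -> None:
        self.commits: Set[str] = set(commits)

    def clone(self) -> "Repo":
        """Create an independent instance with identical content."""
        return Repo(self.commits)

    def commit(self, commit_id: str) -> None:
        self.commits.add(commit_id)

    def push(self, remote: "Repo") -> bool:
        """Succeeds only if the remote holds no commit that is unknown locally."""
        if not remote.commits <= self.commits:
            return False
        remote.commits |= self.commits
        return True

    def pull(self, remote: "Repo") -> None:
        """Take over all remote commits (merging is not modeled here)."""
        self.commits |= remote.commits
```

    If two users clone the same repository and both commit, the first push succeeds while the second is rejected until that user has pulled the other's commits.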

    It is common to have branches dedicated to receiving pulls from remote repositories to avoid merging across repositories. Cloning, pushing and pulling usually each use SSH and require user authentication. It is possible to limit access for specific users to certain parts of the repository.

    Figure 2.2: A Distributed VCS

    DVCS have several notable advantages over centralized systems:

    - private: They compartmentalize teams. Each group within a team (or even every user) can have their own repository instance and adjust its structure to their workflow.

    - offline: They allow geographically distributed teams to work together, even when the internet connection is interrupted or too slow for constant data exchange.

    - safe: If one repository is compromised, it can easily be restored from another instance.

    2.2 History and Popular Systems

    One of the earliest modern VCSs capable of handling multiple files was the Concurrent Versions System (CVS). It was a centralized system with automated merging and only limited support for branching. Its successor is Apache Subversion (SVN), which gives each commit an absolute revision number; each file in the repository carries the revision number of the last commit that changed it. It also keeps an unmodified local copy of the checked-out files on checkout or update, allowing users to review their changes even without a connection to the repository.
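
    The numbering scheme described here, one global counter per repository and a last-changed revision per file, can be illustrated in a few lines. This is a sketch of the scheme as stated in the paragraph above, not of Subversion's actual storage format; `RevisionCounter` is a hypothetical name.

```python
from typing import Dict, Iterable


class RevisionCounter:
    def __init__(self) -> None:
        self.head = 0                             # revision number of the latest commit
        self.file_revisions: Dict[str, int] = {}  # path -> revision that last changed it

    def commit(self, changed_paths: Iterable[str]) -> int:
        """Record a commit touching the given files and return its revision number."""
        self.head += 1
        for path in changed_paths:
            self.file_revisions[path] = self.head
        return self.head
```

    After a first commit changing two files and a second commit changing only one of them, the untouched file still carries revision 1 while the other carries revision 2.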

    Around 2005 two distributed VCSs entered the market: Mercurial (hg, after the chemical symbol for mercury) and git. Both are primarily used via command line interfaces. While git is more flexible, it is also more complex and more difficult to learn than Mercurial [1].

    Reliable data on today's market shares is hard to come by and mostly based on possibly

    biased sources. One source indicates that git has become the dominant solution, based

    on the number of StackExchange questions and Google Trends, with ca. 75% of questions

    about VCS concerning git [20]. Most sources conclude that git is at least among the market

    leaders and keeps gaining popularity (especially in open source projects), with Apache

    Subversion and Mercurial remaining the strongest contenders [1][3].

    2.3 Supplemental Software

    Alongside the different VCS programs, a number of supplementary web services have been

    developed, such as GitHub, GitLab, Bitbucket and SourceForge. At their core, these are file

    hosting services that provide a central repository for a DVCS – either public (open source)

    or private – along with additional features. GitHub, for example, offers tools to review and

    manage changed code and enables web-based documentation and distribution [10]. GitLab,

    meanwhile, aids in project management with task and issue tracking and is optimized for the

    DevOps cycle, i.e. simultaneous development and operation [9].

    Many development environments have VCS interfaces that allow users, even those unfamiliar

    with command line interfaces, to use VCS commands from within their software. This

    extends to branch selection, commit comments and code review with diff, pushing the actual

    systems into the background.

    Chapter 3

    Potential use of VCS in construction

    3.1 Example Project

    3.1.1 Overview

    Whether and how a VCS can serve as a CDE is best explained in an example. Let us consider

    a fictitious construction project with the following participants:

    Figure 3.1: fictitious project diagram

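    In each office several individual planners work on different parts of the project, while the

    architect’s office is responsible for planning coordination. The participants are referred to

    by letters in the following: the client (C), the architect (A), the structural planner (S), the

    MEP planner (M), the general contractor (G) and subcontractors such as the electrician (S2).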

    3.1.2 Planning Stage

    C tasks A with model coordination and agrees to use git as the common data environment. A

    creates a central repository with a master branch that holds the general model and sets

    its designers to work on a separate architectural branch. Each designer works in their own

    local repository, as each has a distinct style of working: some like to commit multiple times

    a day, others only once every few days, and some prefer working from home. In any case,

    they regularly push their work to the architectural branch on A’s central repository after

    pulling from it, to make sure they do not create conflicts there.

    Meanwhile, the client C supervises the design process by pulling each commit from

    the architectural branch of A’s central repository to their own local repo. After each review

    meeting, the reviewed design state in the architectural branch on the central repo is merged

    into the master branch. When variants are explored, a separate branch is created for each.

    Only the accepted variant is merged into the master branch.

    As the design stage progresses, S and M join the project. A creates two new branches on

    the central repo for structural and MEP planning respectively. Both S and M clone A’s central

    repo, including the master branch, into local repos of their own. They work in their

    own repos independently but regularly pull revisions from the master branch and push

    their own changes to their respective branches in A’s repo. As model coordinator, A remains

    responsible for merging major commits to the architectural, structural and MEP branches

    into the master branch.

    Branching can be executed partially, so that the master branch, as the coordination model,

    contains all parts of the model while each discipline-specific branch contains only the files

    relevant to that discipline.

    3.1.3 Execution Stage

    Eventually tendering begins. A creates another branch and modifies it to be published as

    the tender document.

    The contract is subsequently awarded to G, who creates their own repository and clones

    the master repository, which has become the basis of their contract. This clone can again be

    cloned by the subcontractors (partially if necessary).

    As planning continues after the award, the general contractor pulls commits from

    the master branch, and the subcontractors pull (in part) from the general contractor. The diff

    report for each pull can serve as a basis for claim management, as it summarizes everything

    that has changed from one pull to another. For example, the electrician S2 wishes to propose

    a change. They create a new branch from the MEP branch for their offer. G convinces C to

    commission the offer. The new branch is then merged into the MEP branch and subsequently

    into the master branch.
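
    How a pull-to-pull change summary might support claim management can be sketched in a few
    lines of Python. This is an illustrative assumption, not a feature of any particular VCS:
    the function change_report, the file names and the file contents are hypothetical, and each
    pulled state is simplified to a mapping from file path to content.

```python
def change_report(previous_pull, current_pull):
    """Summarize which files were added, removed or modified between two pulls."""
    before, after = set(previous_pull), set(current_pull)
    return {
        "added": sorted(after - before),
        "removed": sorted(before - after),
        "modified": sorted(
            path for path in before & after
            if previous_pull[path] != current_pull[path]
        ),
    }


# Hypothetical states of the contractor's clone after two consecutive pulls.
pull_1 = {
    "architecture.ifc": "state A",
    "structure.ifc": "state A",
    "mep.ifc": "state A",
}
pull_2 = {
    "architecture.ifc": "state A",
    "structure.ifc": "state B",          # changed by the structural planner
    "mep_electrical_offer.ifc": "new",   # added for a proposed change
}

report = change_report(pull_1, pull_2)
```

    A file-level summary like this only states that something changed; what changed inside
    a model file is the harder question examined in the next section.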

    Figure 3.2: Example DAG

    This brief example is obviously vastly simplified, but it demonstrates the basic idea of

    employing a VCS as a Common Data Environment. One significant challenge will be examined

    in the following section.

    3.2 Tracking Changes with Diff

    Since IFC is a text-based format, it appears reasonable to compare versions of a model with

    the diff command.

    However, diff tools are designed for program code, where the semantic purpose of a line

    of code is reasonably apparent when viewed within its immediate context. A single line in

    an IFC file, on the other hand, represents a single data entity such as a vertex or line and

    may contain references to many other entities at unrelated positions in the file. A designer

    examining the raw data therefore cannot practically infer the meaning or context of such an

    object, let alone of changes to it.

    One might consider developing a plugin for an IFC viewer that highlights the identified

    changes within the 3D view. This could be hindered by the generation of the IFC files

    themselves: that process is part of the (usually proprietary) native software used to design the

    model. Because the export process is not standardised, it is not necessarily consistent.

    Furthermore, it involves randomly created identifiers. As a result, exporting identical models

    does not create identical IFC files.
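
    The way a single IFC line points to entities elsewhere in the file can be made visible with
    a short Python sketch. It assumes the usual one-entity-per-line layout of the text-based IFC
    format; the sample lines are invented for illustration and do not come from the experiment
    below, and the function name reference_map is hypothetical. The pattern is deliberately
    simple and would miscount a '#' inside a quoted string.

```python
import re

# "#12=IFCSOMETHING(arguments);" -> id, type name, raw argument text
ENTITY = re.compile(r"#(\d+)\s*=\s*([A-Z0-9_]+)\s*\((.*)\);")
REFERENCE = re.compile(r"#(\d+)")


def reference_map(lines):
    """Map each entity id to its type and the ids of the entities it references."""
    entities = {}
    for line in lines:
        match = ENTITY.fullmatch(line.strip())
        if match:
            ident, type_name, arguments = match.groups()
            referenced = [int(r) for r in REFERENCE.findall(arguments)]
            entities[int(ident)] = (type_name, referenced)
    return entities


# Invented sample lines in the style of an IFC file.
sample = [
    "#1=IFCCARTESIANPOINT((0.,0.,0.));",
    "#2=IFCCARTESIANPOINT((5.,0.,0.));",
    "#3=IFCPOLYLINE((#1,#2));",
    "#870=IFCSHAPEREPRESENTATION(#40,'Axis','Curve2D',(#3));",
]

references = reference_map(sample)
```

    Read on its own, line #870 says little: its meaning only emerges by following the
    references to #3 and from there to the points #1 and #2, which a line-based diff does not do.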

    3.2.1 Experimental Version Comparison

    The extent of these challenges was examined using Autodesk Revit 2019 and its preinstalled

    architectural sample project with 1463 elements:

    - It was exported to IFC three times without any modification (ver1, ver2, ver3).

    - Another view was opened and the model was exported again (ver4).

    - A wall opening was then added and the model exported once more (ver5).

    - The new object was deleted and another version was exported (ver6).

    - A wall was removed and then reinserted at the same position (ver7).

    Each version was then compared with every other one and the results saved to text files

    using the Windows command FC /1 ver1.ifc ver2.ifc > 12.txt. A text editor was then

    used to count the changes based on the occurrence of the headlines. The results can be seen

    in Table 3.1.

    IFC file     1     2     3     4     5     6     7   lines     bytes
    1            -  2031  1629  1574  2074  1988   157  531529  26877680
    2                  -  1876  2237  2005  2061   105  531529  26877680
    3                        -  1833  1625  1865   105  531529  26877680
    4                              -  2090  1654   157  531529  26877680
    5                                    -  2146   159  531545  26878748
    6                                          -   157  531529  26877680
    7                                                -  531529  26877666

    Table 3.1: differences between and sizes of IFC files

    It is striking not only that there are many differences even between IFC files that should

    be identical, but also that the number of differences varies widely. This stems from the way

    the diff tool interprets several changes in close proximity to each other as a single change.

    The differences between versions 1 through 4 and version 6 stem only from randomly generated

    22-character alphanumerical string identifiers (and metadata such as the creation date). The

    number of changes appears to have no statistically relevant correlation with whether the file

    was changed at all. Changing the viewport does not appear to have any impact beyond these

    identifiers, as version 4 has the same file size. The exception is version 7, where the

    differences to the other files were so numerous that the tool counted differences in dozens

    of subsequent lines as a single change, hence the low difference count.
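
    The grouping effect can be reproduced with a small Python sketch. It uses the difflib module
    from the Python standard library as a stand-in for the FC command, so its counts would not
    match Table 3.1; the function name count_change_blocks and the sample lines are hypothetical.

```python
from difflib import SequenceMatcher


def count_change_blocks(old_lines, new_lines):
    """Count contiguous blocks of differing lines, similar to diff 'headlines'."""
    matcher = SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    return sum(1 for tag, *_ in matcher.get_opcodes() if tag != "equal")


original = [f"line {i}" for i in range(10)]

# Three changed lines that are separated by unchanged lines.
scattered = list(original)
for i in (1, 4, 7):
    scattered[i] = f"changed {i}"

# Three changed lines that are directly adjacent.
adjacent = list(original)
for i in (3, 4, 5):
    adjacent[i] = f"changed {i}"

scattered_blocks = count_change_blocks(original, scattered)
adjacent_blocks = count_change_blocks(original, adjacent)
```

    Both variants change three lines, yet the scattered edits are counted as three differences
    and the adjacent ones as a single difference. A count of difference blocks therefore says
    little about how much of a file has actually changed.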

    Another tool called P4Merge was then used to create a three-way comparison between

    versions 1, 2 and 7 in an attempt to filter the changes corresponding only to identifiers (it

    should be noted that this process took several minutes on an average computer). This tool

    identified 2822 differences unique to version 7 - stemming from one edit that is not even

    recognisable or semantically relevant within the model itself.

    3.2.2 Insights

    These observations reveal significant challenges for the prospect of developing the proposed

    software solution. The steps necessary for such a plugin would be the following:

    1. List the differences between two or more versions of an IFC file – this can be

    accomplished using existing algorithms.

    2. Filter out changes that are limited to randomly generated identifiers – abstracting

    identifiers is a basic task of every software compiler and should not be too complicated

    to transfer to building information models.

    3. Filter out objects that are syntactically different but semantically identical – this

    task in particular either requires complex algorithms or could be approached with machine

    learning or other artificial intelligence paradigms. The latter would require an AI with

    a semantic understanding of building information models – a bold requirement, yet one

    that would be useful far beyond version comparison.

    4. List the remaining differences that represent actual changes to the model.

    5. Highlight those changes in the 3D view and the model structure tree.
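
    Steps 1, 2 and 4 of this list can be sketched in Python as follows. The sketch is a rough
    illustration under stated assumptions and not the proposed plugin: it assumes that the
    random identifiers appear as quoted 22-character strings over the characters 0-9, A-Z, a-z,
    _ and $, the sample lines are invented, and the names mask_identifiers and
    remaining_changes are hypothetical. Step 3, the semantic comparison, is not addressed.

```python
import re
from difflib import SequenceMatcher

# Assumed shape of a randomly generated identifier: 22 characters in quotes.
IDENTIFIER = re.compile(r"'[0-9A-Za-z_$]{22}'")


def mask_identifiers(line):
    """Step 2: replace every identifier with the same placeholder."""
    return IDENTIFIER.sub("'<ID>'", line)


def remaining_changes(old_lines, new_lines):
    """Steps 1 and 4: diff the masked lines and return the differing blocks."""
    old = [mask_identifiers(line) for line in old_lines]
    new = [mask_identifiers(line) for line in new_lines]
    matcher = SequenceMatcher(a=old, b=new, autojunk=False)
    return [
        (tag, old[i1:i2], new[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


# Invented sample exports; only the identifier differs between the first two.
id_a, id_b = "A" * 22, "B" * 22
export_1 = [f"#10=IFCWALL('{id_a}',#2,'Wall 1',$,$,#11,#12,$);",
            "#11=IFCLOCALPLACEMENT($,#5);"]
export_2 = [f"#10=IFCWALL('{id_b}',#2,'Wall 1',$,$,#11,#12,$);",
            "#11=IFCLOCALPLACEMENT($,#5);"]
edited = [f"#10=IFCWALL('{id_b}',#2,'Wall 1b',$,$,#11,#12,$);",
          "#11=IFCLOCALPLACEMENT($,#5);"]

identifier_only = remaining_changes(export_1, export_2)
actual_edit = remaining_changes(export_1, edited)
```

    A purely textual mask like this would also hide any ordinary 22-character string, so a real
    implementation would have to know which attribute of an entity holds the identifier.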

    One way of circumventing the problem of semantically irrelevant changes could be a proposal

    introduced by Koch and Firmenich [13, 14]: they define a language that describes Building

    Information Models based on changes, not on states. This language records a sequence of

    modeling operations as opposed to a set of objects. It may reflect a designer’s intent more

    accurately in fewer data points and, by way of being a structured sequence instead of an

    unstructured list, make purely syntactical changes unnecessary or at least obvious. However,

    its implementation requires an extension of the IFC standard, which may hinder the adoption

    of any tools that utilize it.
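
    The difference between storing states and storing changes can be illustrated with a toy
    Python sketch. It is not the language defined by Koch and Firmenich: the operation names
    add, set and remove, the element name and the function replay are hypothetical and only
    serve to show the principle.

```python
def replay(operations):
    """Rebuild the model state by applying the recorded operations in order."""
    model = {}
    for operation, element, *arguments in operations:
        if operation == "add":
            (properties,) = arguments
            model[element] = dict(properties)
        elif operation == "set":
            key, value = arguments
            model[element][key] = value
        elif operation == "remove":
            del model[element]
        else:
            raise ValueError(f"unknown operation: {operation}")
    return model


history = [
    ("add", "wall-1", {"length": 5.0}),
    ("set", "wall-1", "length", 6.0),
]

# A wall is removed and reinserted unchanged, comparable to ver7 in the experiment.
extended = history + [
    ("remove", "wall-1"),
    ("add", "wall-1", {"length": 6.0}),
]

same_state = replay(history) == replay(extended)
new_operations = extended[len(history):]
```

    Both histories lead to the same model state, while the two appended operations document
    exactly what the designer did. In a state-based export the same edit surfaced as thousands
    of textual differences.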

    Chapter 4

    Summary

    4.1 Limitations

    The sections above assume that the IFC standard is employed at least for shared resources.

    That is an idealistic assumption, as not all project owners are prepared to forgo the

    advantages of closed BIM. Committing native formats to the repository is impractical, as

    those binary formats cannot be processed effectively by VCSs. For each commit the entire

    model would have to be archived instead of just the changes since the previous one. This

    would lead to unacceptable storage requirements and defeat the purpose of a VCS in the

    first place. The VCS approach is therefore not applicable to closed BIM environments without

    major adaptation of the VCS software used.

    The considerations in this report focus primarily on the model (however many dimensions

    it may contain). While that could include cost and scheduling information, structural

    and environmental analysis data and more, it excludes documents separate from the model.

    Such documents could include invoices, notices of concern or delay and other legally binding

    documents. These documents are issued independently of the planning process and of each

    other and are never edited. In large projects, dozens of such documents change hands each

    day. It would be possible to create a commit for each such document, leading to a possibly

    very long commit history. Alternatively, they might be stored in a separate repository, in

    which case the concept of a common data environment would no longer be fully intact. This

    is clearly not what VCS are designed for.

    4.2 Outlook

    Because each project partner has a separate repository, it is quite difficult to alter records

    of previous transactions. If a client desires additional protection against tampering with data

    records, it is possible to store each commit in a blockchain that is distributed to each project

    participant. In a blockchain the validity of an entry can only be verified if all preceding

    entries in all versions of the chain are valid as well. If someone wanted to modify an entry,

    they would have to perform computationally intensive operations on more than half of the

    copies of the record.
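
    The chaining principle behind this idea can be shown with a minimal Python sketch based on
    the hashlib module from the standard library. It is a plain hash chain and not a complete
    blockchain: distribution among participants and any consensus mechanism are left out, and
    the function names and commit texts are hypothetical.

```python
import hashlib


def chain_hashes(commits):
    """Hash each commit together with the hash of its predecessor."""
    hashes, previous = [], ""
    for content in commits:
        digest = hashlib.sha256((previous + content).encode("utf-8")).hexdigest()
        hashes.append(digest)
        previous = digest
    return hashes


def first_invalid(commits, recorded_hashes):
    """Return the index of the first commit that no longer matches its hash."""
    for index, (expected, recorded) in enumerate(
        zip(chain_hashes(commits), recorded_hashes)
    ):
        if expected != recorded:
            return index
    return None


# Hypothetical commit records shared with all project participants.
commits = ["A: initial model", "S: add structural grid", "M: add MEP routing"]
recorded = chain_hashes(commits)

# Someone alters the second record after the fact.
tampered = list(commits)
tampered[1] = "S: add structural grid (altered)"

untouched_result = first_invalid(commits, recorded)
tampered_result = first_invalid(tampered, recorded)
```

    Because every hash depends on its predecessor, altering one record invalidates that entry
    and every later one, so the change is detected by anyone who holds the original hashes.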

    To summarize, Version Control Systems fulfil many of the requirements for Common Data

    Environments. Costs will arise from the development effort necessary to capitalize on the

    advantages of VCS, along with an expected limitation in ease of use of the

    non-construction-specific software and thus a steeper learning curve. It remains doubtful

    whether the license fees saved compared to specialized ISO 19650-compatible software would

    offset these costs.

    List of Figures

    1.1 Building Information Model

    1.2 Open vs. Closed and little vs. BIG describe different approaches to BIM

    2.1 Example of a Directed Acyclic Graph

    2.2 A Distributed VCS

    3.1 fictitious project diagram

    3.2 Example DAG

    List of Tables

    3.1 differences between and sizes of IFC files

    References

    [1] 2018 version control software comparison: svn, git, mercurial. 2018. url: https://biz30.timedoctor.com/git-mecurial-and-cvs-comparison-of-svn-software/ (visited on 05/24/2018).

    [2] aiim. What is document management. 2018. url: https://www.aiim.org/What-Is-Document-Imaging.

    [3] Best version control systems. 2018. url: https://www.g2crowd.com/categories/version-control-systems (visited on 05/24/2018).

    [4] André Borrmann, Markus König, Christian Koch, and Jakob Beetz. Building information modeling. 2015. Chapter 12 - Kooperative Datenverwaltung, pages 207–236.

    [5] Ed Boxall. Common data environment (cde): what you need to know for starters. August 26, 2015. url: https://www.aconex.com/blogs/common-data-environment-cde-tutorial.

    [6] Wibke Cartensen. A brief history of version control. 2016. url: https://www.red-gate.com/blog/database-devops/history-of-version-control (visited on 05/27/2018).

    [7] Martin Fiedler. Lean Construction - Das Managementhandbuch. 2018.

    [8] Berthold Firmenich, Christian Koch, Torsten Richter, and Daniel G. Beer. Versioning structured object sets using text based version control systems. In Proceedings of the 22nd CIB-W78 Conference on Information Technology in Construction, 2005.

    [9] GitLab. The only single product for the complete devops lifecycle. 2018. url: https://about.gitlab.com/ (visited on 06/26/2018).

    [10] GitHub. The world’s leading software development platform. 2018. url: https://github.com/ (visited on 06/26/2018).

    [11] Organization of information about construction works – Information management using building information modelling – Part 1: Concepts and Principles. Specification, International Organization for Standardization, 2017.

    [12] Organization of information about construction – Information management using building information modelling – Part 2: Delivery phase of the assets. Specification, International Organization for Standardization, 2017.

    [13] Christian Koch. Bauwerksmodellierung im kooperativen Planungsprozess: Mit der Objektorientierung zur Verarbeitungsorientierung. Dissertation, Bauhaus-Universität Weimar, July 2, 2008.

    [14] Christian Koch and Berthold Firmenich. An approach to distributed building modeling on the basis of versions and changes. In Walid Tizani and Michael J. Mawdesley, editors, Advanced Engineering Informatics. 2011.

    [15] Richard McPartland. What is the common data environment. October 18, 2016. url: https://www.thenbs.com/knowledge/what-is-the-common-data-environment-cde.

    [16] Fred Mills. What is a common data environment? July 15, 2015. url: https://www.theb1m.com/video/what-is-a-common-data-environment.

    [17] Mohamed M. Nour, Berthold Firmenich, Torsten Richter, and Christian Koch. A versioned ifc database for multi-disciplinary synchronous cooperation. In Joint International Conference on Computing and Decision Making in Civil and Building Engineering, June 14, 2006.

    [18] Specification for information management for the capital/delivery phase of construction projects using building information modelling. Specification, Construction Industry Council, 2013.

    [19] Eric Sink. Version control by example. 2011. url: https://ericsink.com/vcbe/html/index.html (visited on 05/27/2018).

    [20] Version control systems popularity in 2016. 2016. url: https://rhodecode.com/insights/version-control-systems-2016 (visited on 05/24/2018).
