"Most people consider a company OSS when it contributes code to an OSS project, but nowadays a significant value of open source lies in non-code contributions...We should start thinking more about how to study non-code contributions, and how this relates to the commercialization of open source projects (and not only software)."
Carlo Daffara, the Italian representative of the European Working Group on Libre software
The Open Source Definition (OSD) sets out the criteria with which the distribution terms of software must comply for that software to be deemed open source. The term "open source", however, is used to label a broad assortment of phenomena that fall well outside the established OSD. In addition, there is ambiguity in what is meant to be covered by the terms "source" and "open".
We envisage a definition of open source that applies equally to software, hardware schematics, content, and processes.
To advance our understanding of open source, we argue that we need to:
- Define open source in terms of the attributes of the systems used to produce, use, and distribute assets
- Act to strengthen the soundness, vitality, and proper functioning of these systems
What is meant by "source"?
The term "open source" was invented as a marketing term in 1998. Its proponents successfully argued that the term "free software" was problematic: it was ambiguous and it was disliked by corporations. Several months ago, a talented software professional challenged our use of the term open source when referring to a software application released under the BSD licence, an open source licence which complies with the OSD. The professional argued that the software was "open code", but not "open source". Intrigued, we asked for clarification.
It was explained to us that the code was produced by a single private organization that periodically bundled together a release and published it on Sourceforge.net. While the code was released under the BSD license, the way it was produced departed from open source practice in three respects:
- No external contributors: all code was developed in-house prior to being published on the Internet
- No visibility of who developed the code and when they developed it
- No mechanisms were available to the general public for (i) contributing to the production of the code prior to its release in Sourceforge.net or (ii) participating in the governance structure of the organization that produced it
In short, the code was open, but the process used to produce it was closed. There was no public community behind the production of the code, and no accommodation for such a community.
Our discussion then turned to the ambiguity in the usage of the word "source". Does source mean the computer code written in a recognized programming language, or the process used to produce the code, or something else? If we allow source to mean two different things: (i) the process used to produce the code, and (ii) the computer code, four cases are possible:
- Case 1: Open process and open computer code
- Case 2: Closed process and open computer code
- Case 3: Open process and closed computer code
- Case 4: Closed process and closed computer code
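The four cases are simply the combinations of two yes/no properties. As a minimal sketch, assuming hypothetical names (`Asset`, `classify`) that do not come from the OSD or any other standard, the mapping can be written in Python, with case numbers following the order of the list above:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Asset:
    """Hypothetical record of the two properties discussed in the text."""
    name: str
    open_process: bool  # is the production process open to the public?
    open_code: bool     # is the code released under an open source licence?


def classify(asset: Asset) -> int:
    """Return the case number (1 to 4), in the order of the list above."""
    if asset.open_process and asset.open_code:
        return 1
    if asset.open_code:
        return 2  # closed process, open code
    if asset.open_process:
        return 3  # open process, closed code
    return 4      # closed process, closed code


# Illustrative value only: the in-house BSD release described earlier.
in_house_release = Asset("in-house BSD release", open_process=False, open_code=True)
print(classify(in_house_release))
```

Treating each property as a simple boolean is itself a simplification, since openness can be a matter of degree.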
We surmise that many use the term open source with case 1 in mind. For example, the Eclipse code is licensed under the OSI approved Eclipse Public License (EPL). The central repository for the code base is available and the general public can, with ease, track the pedigree of the code. Release dates are known and published. The general public is encouraged to contribute to the organization itself, to define projects, and to write code. Moreover, the governance structure used to manage all the Eclipse projects is transparent.
Case 2 was outlined above, and it characterizes what our software professional called "open code" but not "open source". A fundamental contradiction seems to exist when an open source asset is developed using a process controlled by a single party. For example, the Open Office project has been criticized for encouraging a development culture that differs radically from the open-source norm. The majority of the contributors to the Open Office project work for Sun Microsystems.
Case 3 includes instances of organizations and individuals that produce a reference implementation of some standard to accelerate that standard's adoption. The code of the reference implementation is typically produced using open processes but the code itself may not be released under an open source licence. The reason is straightforward: releasing the code under an open source licence could weaken the purpose and authority of the standard by making it easier to introduce deviations from the standard.
Case 4 includes instances which are typically referred to as proprietary or closed software. The process used to produce the code is closed, and the software is released using a non-open source licence. This case includes instances where several organizations create a consortium to produce code that only members of the consortium can see and access. Typically, such consortiums have a tiered membership that stipulates the members' rights with respect to the code.
It can also be argued that the use of "source" does not distinguish among three types of code, any of which could be open or closed. Source can mean: (i) the code used to implement a system or component, (ii) the interface, where what is opened is the application programming interface (API), or (iii) the data underlying the implementation, given that almost any application or system creates value from its underlying data.
It can also be argued that source is not limited to computer code. One could extend the four cases described above to combine open and closed processes with hardware schematics or documentation, two examples of assets which are increasingly being considered as open.
What is meant by "open"?
More recently, we found ourselves struggling to make sense of instances of the word open in the context of community code. For example, we found instances where open meant that releases of the code were made available to the general public (i.e., non-members of a consortium); however, releases to the general public were delayed 12 months from the time they were available to the members of the consortium. We also found instances where what open meant depended upon the level of membership. The more expensive memberships gave members more privileges to participate in and influence the processes, for example through veto power. In these examples, open is not equated with full access; instead, open is a matter of degree, and that degree is meted out in a distinctly defined hierarchy of privilege.
The apparent confusion and disagreement about what is open and what is source, together with the use of open source to refer to phenomena that fall well outside the OSD, led us to conclude that we need to better understand the characteristics of the systems in which open source assets are produced, used and distributed.
We conceptualize any such system as comprising four components:
- Network: the network of individuals and organizations that produce, use and distribute an asset
- Processes: the processes, approaches, rules and understandings that lead to the production, use and distribution of an asset
- Governance: the governance structure of the organization and the projects within the organization
- Value: value created through collaboration and value appropriated through competition
Metrics of health
We observe that a healthy open source system is required to compete with a strong proprietary system; that is, an open source system cannot compete by virtue of the distribution license alone.
What one would generally agree upon as being an open source system could be expressed in terms of the health of its four components. Such a definition would not be static, but would arguably be more accurate and useful in determining the true value of an open source system. For example, one could speak in terms of a system not having reached the status of being open source until it is deemed to be healthy. And an open source system may subsequently cease to be an open source system if its health deteriorates beyond some point. This line of inquiry could then further leverage health of the four components as a means for distinguishing other types of systems such as closed systems and community systems.
Our initial suggestion as to what is relevant for assessing the health of each of the four components of a system is as follows:
- Network: Large, distributed and diverse. We distinguish an asset produced by a well-developed network from an asset produced by a small number of collocated producers who have similar characteristics. A general reference model for an open source asset would be one that is produced by a well-developed network that is able to integrate, test, and quality assure contributions from a large number of diverse individuals and organizations dispersed throughout the world
- Process: Includes meritocracy where one is recognized for the quality of their contributions; transparency in communications and guidelines; recruitment and promotion methods; and mechanisms for dealing with difficult people
- Governance: Includes participation; relationship between contribution and the influence that can be asserted; membership's influence over a project, influence over the overall system governance, and ability to alter the governance structure
- Value creation and appropriation: Usefulness of the asset; how free-riders are addressed (if it is too easy to appropriate value, no one would pay for a membership or undergo an apprenticeship to move from being a developer/contributor who writes code or documentation to a committer with write access to the codebase); access to the asset by virtue of the license
Various other metrics can also be used. What is needed is agreement on the key ones.
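To illustrate how a health-based definition might be applied once key metrics are agreed, here is a minimal sketch in Python. It assumes that each of the four components has already been scored between 0 and 1 by some agreed method. The names (`SystemHealth`, `weak_components`, `is_healthy`), the 0.5 threshold, and the rule that every component must pass are assumptions made purely for illustration, not an established measure.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class SystemHealth:
    """Hypothetical scores in [0, 1] for the four components."""
    network: float
    process: float
    governance: float
    value: float


def weak_components(health: SystemHealth, threshold: float = 0.5) -> list[str]:
    """List the components whose score falls below the assumed threshold."""
    return [f.name for f in fields(health) if getattr(health, f.name) < threshold]


def is_healthy(health: SystemHealth, threshold: float = 0.5) -> bool:
    """Assumed rule: a system is healthy only if no component is weak."""
    return not weak_components(health, threshold)


# Illustrative scores only: a well-licensed asset with a small, closed community.
example = SystemHealth(network=0.2, process=0.4, governance=0.3, value=0.8)
print(is_healthy(example), weak_components(example))
```

Requiring every component to pass reflects the observation that the distribution license alone is not enough; a weighted average would be a different, equally debatable choice.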
Conclusion
While the OSD is useful in promoting a brand and defining the rules licenses must adhere to in order to be considered open source, there is much value in conceptualizing open source as part of a larger system which describes the production, distribution, and use of an asset. We envisage four components of the system: (i) network, (ii) process, (iii) governance, and (iv) value created and appropriated. Moreover, we suggest that an asset becomes an open source asset when it is produced, used and distributed within a system that is healthy in terms of these four components. We also envisage each component to be multidimensional and identify some of the dimensions that could be used to track system health.
Finally, we argue for a definition of open source which is independent of the basic structure of the asset. We envisage a definition of open source that equally applies to software, hardware schematics, content, and processes.
We invite the readership of the OSBR to embark upon a discussion of the proposed positioning of the open source system and the components and metrics identified.