BioLicense Paper by Jong Bhak

BioLicense: An Experiment in Open-Free Licensing for Biology

Jong Bhak jongbhak@genomics.org

Director, Personal Genomics Institute, Genome Research Foundation, Suwon, South Korea.

Abstract

This paper describes the background and philosophy of BioLicense. BioLicense represents an open and free framework for the exchange of biological data, information, and knowledge among individuals, organizations, and commercial entities. The underlying philosophy of BioLicense posits that all biological entities—including humans, animals, plants, microbes, and even computers—are fundamentally information processors. Consequently, the exchange of information is essential to all existence within the biological universe. It follows that interactions require principles and protocols that facilitate effortless and efficient information exchange. The primary distinction between BioLicense and other open or permissive licenses is its foundational premise: because all life forms act as cooperative computing devices, an environment devoid of informational restriction is the most evolutionarily optimized state. A major goal of BioLicense is to empower scientists to achieve unfettered innovation in the mapping of all biochemical processes.

BioLicense Background

The term "license" originally meant "giving permission" to licensees. However, authors and authorities also frequently use it to preclude and control permissions. It is highly questionable how precisely and fairly access to diverse data, information, and media can be restricted within scientific fields.

Recently, biology has experienced two major advancements: the processing of biological information by computers, and the acquisition of large-scale "omics" data by machines, such as DNA sequencers. These breakthroughs are propelling us into a new paradigm where conventional biological information-exchange licenses are no longer sufficient to maximize the efficiency of networked human intelligence. Here, I introduce the concept of an experimental licensing scheme originally developed in 1995.

What is BioLicense?

BioLicense—short for Biological Information Objects License—is a framework representing the completely open and free exchange of biological data, information, and knowledge among information-processing entities, such as scientists.

The core philosophy behind BioLicense is that all biological objects on Earth are information processors. The exchange of information is so fundamental to life that information should be freely available to anyone pursuing further understanding or practical application. In practice, BioLicense proposes that all generated biological information should be distributed and shared as rapidly as possible, with minimal barriers, to accelerate the discovery of molecular and cellular mechanisms.

The major difference between BioLicense and other liberal licensing models (such as Creative Commons) is its philosophical foundation. It views all life forms as cooperative computing devices and asserts that unrestricted information exchange is the most evolutionarily optimized condition. Ultimately, BioLicense aims to reflect the cooperative, networked nature of biological information processing to maximize human innovation.

History of BioLicense

The term was coined to emphasize the necessity of efficient, large-scale data exchange in biology. The MRC Centre in Cambridge, UK, was an early pioneer in computational biology and genomics. Researchers there shared the philosophy that scientific data and knowledge must be "openfreely" accessible to maximize global societal benefit.

These researchers, who were generating massive datasets of biological structures and sequences, used computers to organize and analyze these diverse data types, giving rise to the fields of bioinformatics and computational biology. They recognized that sharing heterogeneous biological data as freely as possible was a catalyst for scientific innovation. Coincidentally, the 1990s saw the global expansion of the internet. As biological data flooded the web, many bioinformatics researchers embraced a philosophy of open sharing over the exclusive ownership of data and programs. This philosophy fused with existing free software models (such as Open Source, Public Domain, Freeware, and Shareware), and various biological data and tool-sharing licenses emerged. BioLicense is one such license.

BioPerl and BioLicense

The first application of BioLicense was the BioPerl project (http://bioperl.net), which also originated from the MRC Centre in Cambridge. BioPerl is an international collaboration used primarily by bioinformaticians. The original BioPerl license—the prototype for BioLicense—was similar to the Perl license, allowing anyone to freely use and modify the source code.

However, there was a radical difference: the original BioPerl license allowed anyone who edited even a single letter of the source code to claim full rights to alter it, sell the code, or even change the associated license. This meant anyone could assume authorship and ownership of the code and the project. This radical approach empowers anyone to take full control of intellectual products to further modify and advance them. In practice, BioLicense was developed to actively nullify any restrictive licensing applied to BioPerl.

BioLicense Follows Natural Information Acquisition

The philosophy driving BioLicense is that information processors naturally acquire and utilize information without requiring permission from the original source. For example, if a person hears a news broadcast, they may apply that information to their life without seeking permission from the broadcasting company.

Furthermore, this concept is based on the idea that information exchange in biology differs fundamentally from that in engineering fields. Most biological knowledge is derived directly from nature, of which our genomes and brains are a part. Biological knowledge and intellectual "properties" are built upon discoveries of systems that have existed since the origin of life, rather than being artificial inventions created by individuals.

Encouraging Literature Copying and Sharing

In the mid-1990s, the concept of open-access literature sharing gained widespread popularity, leading to the creation of open-access commercial publishers like BioMed Central. An extension of BioLicense is the mandate that all literature generated through the "publishing" process must be entirely free for public use. Publishing inherently makes ideas, knowledge, and information public; therefore, anyone with the means should have access to them.

In this regard, BioLicense revives historical traditions of book copying. In the past, if a person was literate and resourceful enough to hand-copy a book, they effectively owned that copy. This practice was so ingrained that copiers assumed ownership of their new manuscript without paying royalties to the original author. Frequently, this copying process introduced valuable additions, improvements, and corrections. This cumulative improvement is evident in the history of cartography, where increasingly accurate maps were built directly upon older ones. If cartographers had been forced to pay royalties for outdated maps, the modern expansion of geographic knowledge would have been stifled. BioLicense actively encourages this tradition of copying, distributing, and iterating.

BioLicense for Scientific Data, Databases, Knowledge, and Media

As vast amounts of data were generated, researchers recognized that traditional scientific literature would eventually merge with large-scale scientific data (such as genome sequences), highlighting the critical importance of shared databases.

The value of scientific data has risen due to the constant integration of heterogeneous databases, which has led to the creation of intellectual properties in the traditional sense. While intellectual property can be protected with great effort, monopolizing large-scale, nature-derived biological data is highly inefficient for future research. From an information-content perspective, the only difference between raw data and polished research results is the number of processing steps applied. There is no absolute, objective distinction between basic data and value-added information like patents, designs, logos, and diverse media.

BioLicense, in principle, does not distinguish between data, databases, information, and knowledge. It can be applied universally to raw data, published manuscripts, software programs, designs, logos, images, video, and other media. To manage these diverse formats, specific subtypes are appended to the BioLicense name. A prime example is BioLicense Genome Data.

BioLicense Genome Data 1.0

Genomic and genetic data, including any personal information, may be shared and re-distributed under the following conditions:

1. The donors of the genomic data must have explicitly agreed that their genetic and personal information will be completely open, free, and redistributable for any purpose (including academic research and commercial product development).

2. No claims regarding the ownership or privacy of the data may be made by anyone who donates, processes, modifies, or re-distributes it.

3. When data formats are altered, the changes must be documented and made public with appropriate version numbers to accurately identify different sets of genome data.
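
As an illustration only, the three conditions above can be read as a checklist applied to a data release. The following is a minimal sketch under stated assumptions: the names GenomeDataRelease, FormatChange, and check_release are hypothetical and not part of BioLicense, and the record fields are one possible way to encode the conditions, not a prescribed or legally meaningful format.

```python
from dataclasses import dataclass, field


@dataclass
class FormatChange:
    """Hypothetical record of one change to a data format."""
    version: str       # version number identifying the altered data set
    description: str   # what was changed
    public: bool       # whether the change note has been made public


@dataclass
class GenomeDataRelease:
    """Hypothetical metadata accompanying a shared genome data set."""
    donor_consented_to_open_use: bool
    ownership_or_privacy_claims: list = field(default_factory=list)
    format_changes: list = field(default_factory=list)


def check_release(release):
    """Return a list of problems; an empty list means all three conditions hold."""
    problems = []
    # First condition: explicit donor agreement to open, free redistribution.
    if not release.donor_consented_to_open_use:
        problems.append("donor has not agreed to open, free redistribution")
    # Second condition: no ownership or privacy claims attached to the data.
    if release.ownership_or_privacy_claims:
        problems.append("ownership or privacy claims are attached")
    # Third condition: every format change is documented, versioned, and public.
    for change in release.format_changes:
        if not (change.version and change.description and change.public):
            problems.append("a format change is undocumented, unversioned, or not public")
            break
    return problems
```

A release with donor consent, no attached claims, and a documented, versioned, public format change would yield an empty list from check_release; any failed condition adds one entry describing it.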

BioLicense Rejects the Concept of "Originators"

BioLicense does not formally acknowledge original founders, creators, or originators. The framework posits that all information objects are derived from pre-existing objects; human intervention merely provides different methods of combination and synthesis.

This concept is known as the BioOriginality Principle. Under this principle, a BioLicense user who substantially modifies a document is permitted to replace the original author's name with their own. Because modern knowledge-network technologies (such as the internet and wiki systems) possess robust version-tracking capabilities, mapping the developmental history of a document is trivial. Therefore, the evolution of knowledge remains traceable without granting exclusive ownership to an original creator.

Subtypes of BioLicense

1. BioAcademic License

The BioAcademic License is essentially an agreement of "honest and honorable scientific research" and a declaration that all R&D results are intended for unlimited sharing. It operates similarly to CopyLeft and "CopyTheft" (a neologism based on open-access concepts from the mid-1980s, though not directly derived from them). This is the foundational license for many BioXXX projects, allowing any non-commercial entity to copy, modify, and redistribute source code, data, ideas, and knowledge. However, it does not automatically grant commercial entities the right to use these materials for profit; commercial usage requires direct negotiation with the academic institutes involved.

2. BioCommercial License

This license allows companies to utilize shared information without being forced to reveal their proprietary source code or attach restrictive open-source terms. It is highly flexible compared to traditional CopyLeft schemes. Although titled "BioCommercial," it is an entirely open-sharing license. It simply provides companies a legally unencumbered pathway to generate revenue using BioLicense-based intellectual properties. In practical terms, it functions much like the public domain, allowing companies to claim specific contributions to a development process to justify their profit-making.

3. BioFreedom License

The BioFreedom License is a mutually binding agreement—an aggressive form of "CopyTheft." Once an entity adopts a BioFreedom license, all of their previous, present, and future copyrighted materials and intellectual properties become perpetually free to the licensor. It fundamentally dismantles conventional copyright. It acts as an unbreakable, perpetual union between the licensee and the licensor, merging their intellectual estates.

Summary

BioLicense is an experimental licensing framework designed to maximize information exchange by dismantling conventional intellectual property protections. Its practical objective is to provide unlimited data, information, and knowledge to every information-processing entity in the world, fostering an environment optimized for unfettered evolutionary and scientific innovation.

References

1. Callahan, Michael E. "The History of Shareware." Paul's Picks. Archived from the original on 2008-02-02. Retrieved 2008-05-13.