Extend Bionty#
Bionty is designed to be extendable in various ways. This guide covers how to:
Extend existing ontologies such as adding a new cell type to a cell type ontology
Adding new ontologies such as a new disease ontology
Implementing new entities that Bionty does not cover
New terms and records#
If you’d like to add new terms to an existing ontology we recommend using Lamin to enable full knowledge management features.
For use cases where Bionty is run standalone, we kindly ask users to be patience since support for it is currently work in progress.
New ontologies#
The easiest way to add new ontologies to existing entities is to adapt the sources_local.yaml
file in the $home/.lamin/bionty/sources
directory.
For example, to add a new disease ontology (termed “inhouse_diseases”) with an associated version and URL, one adds the following lines to the sources.local.yaml
.
Disease:
mondo:
all:
2023-02-06:
source: http://purl.obolibrary.org/obo/mondo/releases/2023-02-06/mondo.owl
md5: 2b7d479d4bd02a94eab47d1c9e64c5db
2022-10-11:
source: http://purl.obolibrary.org/obo/mondo/releases/2022-10-11/mondo.owl
md5: 04b808d05c2c2e81430b20a0e87552bb
name: Mondo Disease Ontology
website: https://mondo.monarchinitiative.org/
doid:
human:
2023-01-30:
source: http://purl.obolibrary.org/obo/doid/releases/2023-01-30/doid.obo
md5: 9f0c92ad2896dda82195e9226a06dc36
name: Human Disease Ontology
website: https://disease-ontology.org/
inhouse_diseases:
human:
2000-01-01:
source: http://download-my-diseases.com/releases/2000-01-01/mydiseases.owl
md5: "md5 if available or leave out this row"
name: My in-house Disease Ontology
website: http://my-website.com
Use “all” if organism doesn’t apply or unknown.
The md5 sum is optional (leave out if not available) and can be calculated with for example:
import hashlib
from pathlib import Path
def calculate_md5(file_path: Path | str) -> str:
with open(file_path, "rb") as f:
md5 = hashlib.md5()
while True:
data = f.read(8192)
if not data:
break
md5.update(data)
file_md5 = md5.hexdigest()
print(file_md5)
If desired, the new ontology can be set as default. See Configuration for more details.
New entites#
Adding new entities to Bionty requires subclassing the bionty.Bionty
and modifying the sources_local.yaml
file.
The bionty.Bionty
requires several properties to be defined:
organism: str,
source: str,
version: str,
These are automatically populated by either the currently used Bionty sources (see Configuration) or explicitly passed as parameters when initializing an Entity.
Hence, a new Bionty class MyEntity
would be defined as:
from bionty import Bionty
class MyEntity(Bionty):
"""MyEntity."""
def __init__(
self,
organism: Optional[str] = None,
source: Optional[Literal["mydatabase_1", "mydatabase_2"]] = None,
version: Optional[str] = None,
**kwargs
) -> None:
super().__init__(
source=source,
version=version,
organism=organism,
**kwargs
)
The local.yml
would then need to be extended as:
MyEntity:
mysource_1:
human:
2042-01-01:
source: http://my-url/releases/2042-01-01/mysource_1.owl
md5: "md5 if available or leave out this row"
mysource_2:
all:
2042-01-01:
source: http://my-url/releases/2042-01-01/mysource_2.owl
md5: "md5 if available or leave out this row"
Great! Now we’ve added a new Bionty class, which can be used with all Bionty functions! 🎉