bionty.Bionty

class bionty.Bionty(source, version=None, species=None, *, reference_id=None, synonyms_field=None, include_id_prefixes=None, include_name_prefixes=None, exclude_id_prefixes=None, exclude_name_prefixes=None, **kwargs)

Bases: object

Biological entity as a Bionty object.

See Guide for background.

Attributes

ontology

The Pronto Ontology object.

See: https://pronto.readthedocs.io/en/stable/api/pronto.Ontology.html

source: The name of the source.

species: The name of the species of the Bionty entity.

version: The name of the version of the Bionty entity.

Methods

curate(df, column=None, reference_id=None, case_sensitive=True)

Curate the index of the passed DataFrame to conform with the default identifier.

If column is None, checks the existing index for compliance with the default identifier.
If column denotes an entity identifier, tries to map that identifier to the default identifier.

Parameters:

df – The input Pandas DataFrame to curate.
column – The column in the passed Pandas DataFrame to curate.
reference_id – The reference column in the ontology Pandas DataFrame. Defaults to ‘ontology_id’.
case_sensitive – Whether the curation should be case sensitive or not. Defaults to True.

Return type:

DataFrame

Returns:

The DataFrame with the curated index and a boolean __curated__ column that indicates compliance with the default identifier.

df()

Pandas DataFrame.

Return type: DataFrame

fuzzy_match(string, reference_id, synonyms_field='synonyms', case_sensitive=True, return_ranked_results=False)

Fuzzy matching of a given string using RapidFuzz.

Parameters:

string – The input string to match.
reference_id – The BiontyField of the ontology that the input string is matched against.
synonyms_field – The synonyms field to also match against. If None, synonyms are not matched.
case_sensitive – Whether the match is case sensitive.
return_ranked_results – Whether to return all entries ranked by matching ratio.

Returns:

The best match of the input string, or all entries ranked by matching ratio if return_ranked_results is True.
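The ranking behaviour described above can be sketched without Bionty. This is a minimal illustration, not the library's implementation: it swaps in the standard library's difflib.SequenceMatcher for RapidFuzz, and the helper name fuzzy_match_sketch and the candidate names are hypothetical.

```python
from difflib import SequenceMatcher


def fuzzy_match_sketch(string, names, case_sensitive=True, return_ranked_results=False):
    """Rank candidate names by similarity to `string` (hypothetical helper).

    Assumption: difflib stands in for RapidFuzz, so scores will differ
    from what Bionty reports.
    """
    def ratio(a, b):
        if not case_sensitive:
            a, b = a.lower(), b.lower()
        return SequenceMatcher(None, a, b).ratio()

    # Highest similarity first.
    ranked = sorted(names, key=lambda name: ratio(string, name), reverse=True)
    return ranked if return_ranked_results else ranked[0]


# Illustrative candidate names, not loaded from any ontology.
names = ["T cell", "B cell", "natural killer cell"]
best = fuzzy_match_sketch("T cel", names)
ranked = fuzzy_match_sketch(
    "t CELL", names, case_sensitive=False, return_ranked_results=True
)
```

With case_sensitive=False both strings are lowercased before scoring, so "t CELL" still ranks "T cell" first.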

inspect(identifiers, reference_id, return_df=False)

Inspect whether a list of identifiers is mappable to the entity reference.

Parameters:

identifiers – Identifiers that will be checked against the Ontology.
reference_id – The BiontyField of the ontology to compare against. Examples are ‘ontology_id’ to map against the ontology ID or ‘name’ to map against the ontology's name field.
return_df – Whether to return a Pandas DataFrame.

Return type:

Union[DataFrame, dict[str, list[str]]]

Returns:

A dictionary that maps the input ontology (keys) to the ontology field (values).
If return_df is True, a Pandas DataFrame with the curated index and a boolean __curated__ column that indicates compliance with the default identifier.
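As a rough illustration of the mappability check, the sketch below partitions identifiers by membership in a set of reference values. It is not Bionty's implementation: the helper name inspect_sketch and the dictionary keys "mapped" and "not_mapped" are assumptions chosen for the example, and "CL:9999999" is a deliberately invalid placeholder.

```python
def inspect_sketch(identifiers, reference_values):
    """Split identifiers into those found in the reference and those not.

    Assumption: the keys "mapped" and "not_mapped" are hypothetical and
    only serve this sketch.
    """
    reference = set(reference_values)
    result = {"mapped": [], "not_mapped": []}
    for identifier in identifiers:
        key = "mapped" if identifier in reference else "not_mapped"
        result[key].append(identifier)
    return result


# Illustrative reference values and inputs; "CL:9999999" is a placeholder.
report = inspect_sketch(
    ["CL:0000084", "CL:9999999"],
    ["CL:0000084", "CL:0000236"],
)
```

The result has the dict[str, list[str]] shape given in the return type, with input order preserved inside each list.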

lookup(field='name')

Return an auto-complete object for the bionty id.

Parameters: field – The field to look up the values for. Set this parameter to, for example, ‘ontology_id’ to look up by ID. Defaults to ‘name’.
Return type: tuple
Returns: A NamedTuple of lookup information of the entity's values.
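To show why a NamedTuple works as an auto-complete object, here is a minimal sketch that builds one from a list of records. The helper name lookup_sketch, the key normalisation, and the record layout are assumptions for illustration and may differ from what Bionty does.

```python
import re
from collections import namedtuple


def lookup_sketch(records, field="name"):
    """Build a namedtuple whose attribute names come from `field`.

    Assumption: values are normalised to lowercase identifiers by
    replacing runs of non-word characters with underscores.
    """
    keys = [
        re.sub(r"\W+", "_", str(record[field])).strip("_").lower()
        for record in records
    ]
    # rename=True swaps invalid or duplicate names for positional ones (_0, _1, ...).
    Lookup = namedtuple("Lookup", keys, rename=True)
    return Lookup(*records)


# Illustrative records, not loaded from Bionty.
records = [
    {"name": "T cell", "ontology_id": "CL:0000084"},
    {"name": "B cell", "ontology_id": "CL:0000236"},
]
by_name = lookup_sketch(records)
by_id = lookup_sketch(records, field="ontology_id")
```

In an interactive session, typing by_name. and pressing Tab would then offer t_cell and b_cell as completions.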

map_synonyms(identifiers, reference_id, *, synonyms_field='synonyms', return_mapper=False)

Maps input identifiers against Ontology synonyms.

Parameters:

identifiers – Identifiers that will be mapped against an Ontology field (BiontyField).
reference_id – The BiontyField of ontology representing the identifiers.
return_mapper – Whether to return a dictionary of {identifiers : <mapped reference_id values>}.

Return type:

Union[Dict[str, str], List[str]]

Returns:

A list of mapped reference_id values if return_mapper is False.
A dictionary of mapped values with mappable identifiers as keys and values mapped to reference_id as values if return_mapper is True.
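The two return shapes can be sketched as follows. This is an illustration under stated assumptions rather than Bionty's implementation: the helper name map_synonyms_sketch is hypothetical, synonyms are assumed to be stored as a pipe-delimited string, and unmapped identifiers are assumed to pass through unchanged in the list form.

```python
def map_synonyms_sketch(identifiers, records, reference_id="name",
                        synonyms_field="synonyms", return_mapper=False):
    """Map identifiers to reference values via a synonyms field.

    Assumptions: synonyms are pipe-delimited, and identifiers without a
    matching synonym are returned unchanged in the list form.
    """
    synonym_to_reference = {}
    for record in records:
        for synonym in (record.get(synonyms_field) or "").split("|"):
            if synonym:
                synonym_to_reference[synonym] = record[reference_id]

    # Only identifiers that match a synonym appear in the mapper.
    mapper = {
        identifier: synonym_to_reference[identifier]
        for identifier in identifiers
        if identifier in synonym_to_reference
    }
    if return_mapper:
        return mapper
    return [mapper.get(identifier, identifier) for identifier in identifiers]


# Illustrative records, not loaded from Bionty.
records = [
    {"name": "T cell", "synonyms": "T lymphocyte|T-cell"},
    {"name": "B cell", "synonyms": "B lymphocyte"},
]
inputs = ["T-cell", "B lymphocyte", "unknown"]
mapped_list = map_synonyms_sketch(inputs, records)
mapped_dict = map_synonyms_sketch(inputs, records, return_mapper=True)
```

The list form keeps the input length and order, while the mapper contains only the mappable identifiers as keys.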