bionty.Bionty

class bionty.Bionty(source, version=None, species=None, *, reference_id=None, synonyms_field=None, include_id_prefixes=None, include_name_prefixes=None, exclude_id_prefixes=None, exclude_name_prefixes=None, **kwargs)

Bases: object

Biological entity as a Bionty.

See Guide for background.

Attributes

ontology

The Pronto Ontology object.

See: https://pronto.readthedocs.io/en/stable/api/pronto.Ontology.html

source

Name of the source.

species

The name of the species of the Bionty entity.

version

The name of the version of the Bionty entity.

Methods

curate(df, column=None, reference_id=None, case_sensitive=True)

Curate the index of the passed DataFrame to conform with the default identifier.

  • If column is None, checks the existing index for compliance with the default identifier.

  • If column denotes an entity identifier, tries to map that identifier to the default identifier.

Parameters:
  • df – The input Pandas DataFrame to curate.

  • column – The column in the passed Pandas DataFrame to curate.

  • reference_id – The reference column in the ontology Pandas DataFrame. Defaults to ‘ontology_id’.

  • case_sensitive – Whether the curation should be case sensitive or not. Defaults to True.

Return type:

DataFrame

Returns:

The DataFrame with the curated index and a boolean __curated__ column that indicates compliance with the default identifier.
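The compliance check behind the __curated__ column can be pictured with a small stand-in. This is a minimal sketch using plain Python lists rather than a DataFrame, and it is not bionty's implementation: flag_curated and the example identifiers are hypothetical.

```python
def flag_curated(index_values, reference_values, case_sensitive=True):
    """Pair each index value with a boolean, like a __curated__ column."""
    if case_sensitive:
        reference = set(reference_values)
        return [(value, value in reference) for value in index_values]
    # Case-insensitive: normalize both sides before comparing.
    reference = {value.lower() for value in reference_values}
    return [(value, value.lower() in reference) for value in index_values]


# Hypothetical identifiers, not taken from any real ontology.
reference_ids = ["ONT:0001", "ONT:0002"]
index = ["ONT:0001", "ont:0002", "unknown"]

strict = flag_curated(index, reference_ids)
relaxed = flag_curated(index, reference_ids, case_sensitive=False)
print(strict)
print(relaxed)
```

With case_sensitive=True the lowercase variant is flagged as non-compliant. With case_sensitive=False it is accepted.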

df()

Pandas DataFrame.

Return type:

DataFrame

fuzzy_match(string, reference_id, synonyms_field='synonyms', case_sensitive=True, return_ranked_results=False)

Fuzzy matching of a given string using RapidFuzz.

Parameters:
  • string – The input string to match.

  • reference_id – The BiontyField of the ontology that the input string is matched against.

  • synonyms_field – The field containing synonyms to also match against. If None, synonyms are not matched.

  • case_sensitive – Whether the match is case sensitive.

  • return_ranked_results – Whether to return all entries ranked by matching ratio.

Returns:

The best match of the input string, or all entries ranked by matching ratio if return_ranked_results is True.
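Ranking candidates by a similarity ratio can be illustrated without the library. The sketch below swaps RapidFuzz for difflib.SequenceMatcher from the Python standard library so that it runs without extra dependencies. rank_matches and the candidate names are hypothetical, and the ratios will differ from those RapidFuzz produces.

```python
from difflib import SequenceMatcher


def rank_matches(query, candidates, case_sensitive=True):
    """Return (candidate, ratio) pairs sorted by similarity, best first."""

    def ratio(candidate):
        if case_sensitive:
            a, b = query, candidate
        else:
            a, b = query.lower(), candidate.lower()
        return SequenceMatcher(None, a, b).ratio()

    scored = [(candidate, ratio(candidate)) for candidate in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical entity names standing in for an ontology field.
names = ["T cell", "B cell", "neuron"]

ranked = rank_matches("T cel", names)
best = ranked[0][0]
print(best)
print(ranked)
```

Taking only the first entry corresponds to returning the best match. Keeping the full sorted list corresponds to return_ranked_results=True.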

inspect(identifiers, reference_id, return_df=False)

Inspect whether a list of identifiers is mappable to the entity reference.

Parameters:
  • identifiers – Identifiers that will be checked against the Ontology.

  • reference_id – The BiontyField of the ontology to compare against. Examples are ‘ontology_id’ to map against the ontology ID or ‘name’ to map against the ontology's names.

  • return_df – Whether to return a Pandas DataFrame.

Return type:

Union[DataFrame, dict[str, list[str]]]

Returns:

  • A dictionary that maps the input ontology (keys) to the ontology field (values).

  • If return_df is True, a Pandas DataFrame with the curated index and a boolean __curated__ column that indicates compliance with the default identifier.
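The identifier check can be pictured as a simple membership split. The sketch below is a stand-in, not bionty's implementation: split_by_mappability is a hypothetical helper, and the dictionary keys mapped and not_mapped are an assumption about how a dict[str, list[str]] result might be grouped.

```python
def split_by_mappability(identifiers, reference_values):
    """Group identifiers by whether they occur in the reference values."""
    reference = set(reference_values)
    # The key names are an assumption of this sketch.
    result = {"mapped": [], "not_mapped": []}
    for identifier in identifiers:
        key = "mapped" if identifier in reference else "not_mapped"
        result[key].append(identifier)
    return result


# Hypothetical names standing in for the values of an ontology field.
reference_names = ["T cell", "B cell"]
report = split_by_mappability(["T cell", "Tcell", "B cell"], reference_names)
print(report)
```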

lookup(field='name')

Return an auto-complete object for the values of the given field.

Parameters:

field – The field to look up the values for. Adapt this parameter to, for example, ‘ontology_id’ to look up by ID. Defaults to ‘name’.

Return type:

tuple

Returns:

A NamedTuple of lookup information for the entity's values.
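A named tuple gives attribute-style access, which is what makes auto-completion possible in an interactive session. The following is a minimal sketch of that idea using collections.namedtuple, not bionty's implementation. build_lookup is hypothetical, and the sketch assumes every value starts with a letter and that no two values collapse to the same attribute name.

```python
import re
from collections import namedtuple


def build_lookup(values):
    """Build a named tuple whose attribute names are sanitized values."""
    fields = {}
    for value in values:
        # Replace every character that is not valid in an identifier.
        key = re.sub(r"[^0-9a-zA-Z_]", "_", value)
        fields[key] = value
    Lookup = namedtuple("Lookup", list(fields))
    return Lookup(**fields)


# Hypothetical entity names.
lookup = build_lookup(["T cell", "B cell", "neuron"])
print(lookup.T_cell)
print(lookup._fields)
```

Typing lookup. followed by the tab key in an interactive shell would then offer T_cell, B_cell, and neuron as completions.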

map_synonyms(identifiers, reference_id, *, synonyms_field='synonyms', return_mapper=False)

Maps input identifiers against Ontology synonyms.

Parameters:
  • identifiers – Identifiers that will be mapped against an Ontology field (BiontyField).

  • reference_id – The BiontyField of the ontology that represents the identifiers.

  • synonyms_field – The field of the ontology that contains the synonyms. Defaults to ‘synonyms’.

  • return_mapper – Whether to return a dictionary of {identifier: <mapped reference_id value>}.

Return type:

Union[Dict[str, str], List[str]]

Returns:

  • A list of mapped reference_id values if return_mapper is False.

  • A dictionary of mapped values with mappable identifiers as keys and values mapped to reference_id as values if return_mapper is True.
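The two return shapes can be illustrated with a small stand-in. This is a sketch under stated assumptions, not bionty's implementation: map_via_synonyms and the example records are hypothetical, synonyms are assumed to be stored as a single string separated by "|", and unmappable identifiers are assumed to pass through unchanged in the list form.

```python
def map_via_synonyms(identifiers, records, reference_id="name",
                     synonyms_field="synonyms", return_mapper=False):
    """Map identifiers to reference values through a synonyms field."""
    # Build a synonym -> reference value table from the records.
    synonym_to_reference = {}
    for record in records:
        for synonym in record[synonyms_field].split("|"):
            if synonym:
                synonym_to_reference[synonym] = record[reference_id]

    # Only identifiers that match a synonym end up in the mapper.
    mapper = {
        identifier: synonym_to_reference[identifier]
        for identifier in identifiers
        if identifier in synonym_to_reference
    }
    if return_mapper:
        return mapper
    # Assumption: identifiers without a matching synonym are kept as-is.
    return [mapper.get(identifier, identifier) for identifier in identifiers]


# Hypothetical ontology records.
records = [
    {"name": "T cell", "synonyms": "T lymphocyte|T-cell"},
    {"name": "B cell", "synonyms": "B lymphocyte"},
]
queries = ["T-cell", "B lymphocyte", "neuron"]

mapped_list = map_via_synonyms(queries, records)
mapped_dict = map_via_synonyms(queries, records, return_mapper=True)
print(mapped_list)
print(mapped_dict)
```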