graphein.grn#
Config#
- class graphein.grn.config.GRNGraphConfig(*, kwargs: Dict[str, Union[str, int, float]] = {}, trrust_config: graphein.grn.config.TRRUSTConfig = TRRUSTConfig(filtering_functions=None, root_dir=None, kwargs=None), regnetwork_config: graphein.grn.config.RegNetworkConfig = RegNetworkConfig(filtering_functions=None, root_dir=None, kwargs=None))[source]#
Config object for gene regulatory network graph construction.
- Parameters
kwargs (Dict[str, Union[str, int, float]], optional) – Keyword args for GRN graph construction
trrust_config (graphein.grn.config.TRRUSTConfig, optional) – Config object specifying parameters for parsing TRRUST. Defaults to default config object.
regnetwork_config (graphein.grn.config.RegNetworkConfig, optional) – Config object specifying parameters for parsing RegNetwork. Defaults to default config object.
- class graphein.grn.config.RegNetworkConfig(*, filtering_functions: List[Callable] = None, root_dir: pathlib.Path = None, kwargs: Dict[str, Union[str, int, float]] = None)[source]#
Config object containing parameters for parsing gene regulatory networks from RegNetwork: http://regnetworkweb.org/.
- Parameters
filtering_functions (List[Callable], optional) – List of functions to apply to the RegNetwork dataframe prior to graph construction. Defaults to None.
- class graphein.grn.config.TRRUSTConfig(*, filtering_functions: List[Callable] = None, root_dir: pathlib.Path = None, kwargs: Dict[str, Union[str, int, float]] = None)[source]#
Config object for parsing gene regulatory networks from TRRUST: https://www.grnpedia.org/trrust/
- Parameters
filtering_functions (List[Callable], optional) – List of functions to apply to the TRRUST dataframe prior to graph construction. Defaults to None.
root_dir (pathlib.Path, optional) – Specifies location of TRRUST dataset (will download to this path if not available). Defaults to None.
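As a rough illustration of how the config classes above might be combined, the sketch below defines a filtering function and passes it to `TRRUSTConfig`. The constructor keywords follow the signatures listed above; the helper names `keep_known_regulation` and `make_config`, the `regtype` value `"Unknown"`, and the `datasets/trrust` path are assumptions made for the example.

```python
from pathlib import Path

import pandas as pd


def keep_known_regulation(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical filter: drop rows whose regulation type is 'Unknown'.

    Assumes the dataframe has a 'regtype' column.
    """
    return df[df["regtype"] != "Unknown"].reset_index(drop=True)


def make_config(root_dir: Path = Path("datasets/trrust")):
    """Build a GRNGraphConfig that applies the filter to TRRUST only."""
    # Imported lazily so the filter above can be tried without graphein installed.
    from graphein.grn.config import GRNGraphConfig, TRRUSTConfig

    trrust_config = TRRUSTConfig(
        filtering_functions=[keep_known_regulation],
        root_dir=root_dir,  # assumed location; adjust to your own layout
    )
    return GRNGraphConfig(trrust_config=trrust_config)
```

A filtering function only needs to accept a dataframe and return a dataframe, so it can be checked on a small hand-made table before it is attached to a config.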
Graphs#
- graphein.grn.graphs.compute_grn_graph(gene_list: List[str], edge_construction_funcs: List[Callable], graph_annotation_funcs: Optional[List[Callable]] = None, node_annotation_funcs: Optional[List[Callable]] = None, edge_annotation_funcs: Optional[List[Callable]] = None, config: Optional[graphein.grn.config.GRNGraphConfig] = None) networkx.classes.graph.Graph [source]#
Computes a Gene Regulatory Network Graph from a list of gene IDs
- Parameters
gene_list (List[str]) – List of gene identifiers
edge_construction_funcs (List[Callable]) – List of functions to construct edges with
graph_annotation_funcs (List[Callable], optional) – List of functions to annotate graph metadata, defaults to None
node_annotation_funcs (List[Callable], optional) – List of functions to annotate node metadata, defaults to None
edge_annotation_funcs (List[Callable], optional) – List of functions to annotate edge metadata, defaults to None
config (graphein.grn.GRNGraphConfig, optional) – Config specifying additional parameters for TRRUST and RegNetwork, defaults to None
- Returns
nx.Graph of the gene regulatory network
- Return type
nx.Graph
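A minimal usage sketch, assuming graphein is installed and that each edge construction function is called with the graph as its only positional argument. The wrapper name `build_grn_graph` is hypothetical; `functools.partial` is used here to bind the filtering functions before the edge function is handed to `compute_grn_graph`.

```python
from functools import partial
from typing import Callable, List, Optional


def build_grn_graph(
    gene_list: List[str],
    trrust_filters: Optional[List[Callable]] = None,
):
    """Build a GRN graph from TRRUST and RegNetwork edges (hypothetical wrapper)."""
    # Imported lazily so that defining this helper does not require graphein.
    from graphein.grn.edges import add_regnetwork_edges, add_trrust_edges
    from graphein.grn.graphs import compute_grn_graph

    return compute_grn_graph(
        gene_list=gene_list,
        edge_construction_funcs=[
            # Bind the optional filters so the callable only needs the graph.
            partial(add_trrust_edges, trrust_filtering_funcs=trrust_filters),
            add_regnetwork_edges,
        ],
    )
```

Calling `build_grn_graph(["AATF", "MYC", "USF1", "SP1", "TP53", "DUSP1"])` may download the TRRUST and RegNetwork datasets on first use.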
- graphein.grn.graphs.parse_kwargs_from_config(config: graphein.grn.config.GRNGraphConfig) graphein.grn.config.GRNGraphConfig [source]#
If configs for specific datasets (TRRUST, RegNetwork) are provided in the global GRNGraphConfig, their parameters are used to update config.kwargs.
- Parameters
config (graphein.grn.GRNGraphConfig) – GRN graph configuration object.
- Returns
config with updated config.kwargs
- Return type
graphein.grn.GRNGraphConfig
Edges#
- graphein.grn.edges.add_interacting_genes(G: networkx.classes.graph.Graph, df: pandas.core.frame.DataFrame, kind: str) networkx.classes.graph.Graph [source]#
Generic function for adding interaction edges to GRNGraph
- Parameters
G (nx.Graph) – GRNGraph to populate with edges
df (pd.DataFrame) – DataFrame containing edgelist
kind (str) – name of interaction type
- Returns
Graph with edges added
- Return type
nx.Graph
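The idea of populating a graph from an edgelist dataframe can be sketched as follows. This is an illustrative stand-in rather than graphein's implementation: the function name `add_edges_from_edgelist`, the column names `g1` and `g2`, the choice of a directed graph, and the decision to store `kind` as a set are all assumptions.

```python
import networkx as nx
import pandas as pd


def add_edges_from_edgelist(G: nx.Graph, df: pd.DataFrame, kind: str) -> nx.Graph:
    """Add one edge per dataframe row, tagging each edge with its source kind."""
    for g1, g2 in zip(df["g1"], df["g2"]):
        if G.has_edge(g1, g2):
            # Edge already known from another source: record the extra kind.
            G.edges[g1, g2]["kind"].add(kind)
        else:
            G.add_edge(g1, g2, kind={kind})
    return G


# Toy data with placeholder gene names; not real regulatory interactions.
G = nx.DiGraph()
G.add_nodes_from(["A", "B", "C"])
edges = pd.DataFrame({"g1": ["A", "C"], "g2": ["B", "A"]})

G = add_edges_from_edgelist(G, edges, kind="trrust")
G = add_edges_from_edgelist(G, edges.head(1), kind="regnetwork")
```

Keeping `kind` as a set lets the same pair of genes carry evidence from more than one database without duplicating the edge.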
- graphein.grn.edges.add_regnetwork_edges(G: networkx.classes.graph.Graph, regnetwork_filtering_funcs: Optional[List[Callable]] = None) networkx.classes.graph.Graph [source]#
Adds edges from RegNetwork to GRNGraph
- Parameters
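G (nx.Graph) – Graph to add edges to (populated with gene_id nodes)
regnetwork_filtering_funcs (List[Callable], optional) – List of functions to apply to the RegNetwork dataframe as pre-processing prior to graph construction. Defaults to None.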
- Returns
nx.Graph GRNGraph with RegNetwork regulatory interactions added as edges
- graphein.grn.edges.add_trrust_edges(G: networkx.classes.graph.Graph, trrust_filtering_funcs: Optional[List[Callable]] = None) networkx.classes.graph.Graph [source]#
Adds edges from TRRUST to GRNGraph
- Parameters
G (nx.Graph) – Graph to add edges to (populated with gene_id nodes)
trrust_filtering_funcs (List[Callable], optional) – List of functions to apply to the TRRUST dataframe as pre-processing prior to graph construction. Defaults to None.
- Returns
nx.Graph GRNGraph with TRRUST regulatory interactions added as edges
- Return type
nx.Graph
Features#
Database Parsers#
RegNetwork#
- graphein.grn.parse_regnetwork.RegNetwork_df(gene_list: List[str], root_dir: Optional[pathlib.Path] = None, filtering_funcs: Optional[List[Callable]] = None) pandas.core.frame.DataFrame [source]#
Generates standardised dataframe with RegNetwork regulatory interactions, filtered according to user’s input.
- Returns
Standardised dataframe with RegNetwork interactions
- graphein.grn.parse_regnetwork.filter_RegNetwork(df: pandas.core.frame.DataFrame, funcs: Optional[List[Callable]] = None) pandas.core.frame.DataFrame [source]#
Filters results of RegNetwork call by providing a list of user-defined functions that accept a dataframe and return a dataframe
- Parameters
df – pd.Dataframe to filter
funcs – list of functions that carry out dataframe processing
- Returns
processed dataframe
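The filtering contract described above (each function accepts a dataframe and returns a dataframe) can be sketched as a simple chain. The helper `apply_filters` and the two example filters are hypothetical, and the `g1`/`g2` column names are assumed; the sketch only shows that the functions run in list order and that `None` leaves the dataframe untouched.

```python
from typing import Callable, List, Optional

import pandas as pd


def apply_filters(
    df: pd.DataFrame, funcs: Optional[List[Callable]] = None
) -> pd.DataFrame:
    """Apply each filter in order; with funcs=None the input is returned as-is."""
    if funcs is None:
        return df
    for func in funcs:
        df = func(df)
    return df


def drop_self_loops(df: pd.DataFrame) -> pd.DataFrame:
    """Remove rows where a gene is listed as regulating itself."""
    return df[df["g1"] != df["g2"]]


def drop_duplicate_pairs(df: pd.DataFrame) -> pd.DataFrame:
    """Keep the first row for each (g1, g2) pair."""
    return df.drop_duplicates(subset=["g1", "g2"])


# Toy data with placeholder gene names.
raw = pd.DataFrame({"g1": ["A", "A", "B", "C"], "g2": ["B", "B", "B", "A"]})
filtered = apply_filters(raw, [drop_self_loops, drop_duplicate_pairs])
```

Because the filters are applied sequentially, their order matters whenever one filter changes which rows a later one sees.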
- graphein.grn.parse_regnetwork.load_RegNetwork_interactions(root_dir: Optional[pathlib.Path] = None) pandas.core.frame.DataFrame [source]#
Loads RegNetwork interaction datafile. Downloads the file first if not already present.
- graphein.grn.parse_regnetwork.load_RegNetwork_regulation_types(root_dir: Optional[pathlib.Path] = None) pandas.core.frame.DataFrame [source]#
Loads RegNetwork regulation types. Downloads the file first if not already present.
- graphein.grn.parse_regnetwork.parse_RegNetwork(gene_list: List[str], root_dir: Optional[pathlib.Path] = None) pandas.core.frame.DataFrame [source]#
Parser for RegNetwork interactions
- Parameters
gene_list – List of gene identifiers
- Returns
Pandas dataframe with the regulatory interactions between genes in the gene list
- graphein.grn.parse_regnetwork.standardise_RegNetwork(df: pandas.core.frame.DataFrame) pandas.core.frame.DataFrame [source]#
Standardises RegNetwork dataframe, e.g. puts everything into a common format
- Parameters
df (pd.DataFrame) – Source specific Pandas dataframe
- Returns
Standardised dataframe
- Return type
pd.DataFrame
TRRUST#
Utilities for parsing the TRRUST database.
- graphein.grn.parse_trrust.TRRUST_df(gene_list: List[str], filtering_funcs: Optional[List[Callable]] = None) pandas.core.frame.DataFrame [source]#
Generates standardised dataframe with TRRUST regulatory interactions, filtered according to user’s input.
- Parameters
gene_list (List[str]) – List of gene identifiers
filtering_funcs (List[Callable]) – Functions with which to filter the dataframe.
- Returns
Standardised dataframe with TRRUST interactions
- Return type
pd.DataFrame
- graphein.grn.parse_trrust.filter_TRRUST(df: pandas.core.frame.DataFrame, funcs: Optional[List[Callable]]) pandas.core.frame.DataFrame [source]#
Filters results of TRRUST call by applying user-supplied filtering functions.
- Parameters
df (pd.DataFrame) – Source specific Pandas dataframe (TRRUST) with results of the API call
funcs (List[Callable]) – User functions to filter the results.
- Returns
Source specific Pandas dataframe with filtered results
- Return type
pd.DataFrame
- graphein.grn.parse_trrust.load_TRRUST(root_dir: Optional[pathlib.Path] = None) pandas.core.frame.DataFrame [source]#
Loads the TRRUST datafile. If file not found, it is downloaded.
- Parameters
root_dir (pathlib.Path, optional) – Root directory path to either find or download TRRUST
- Returns
TRRUST database as a dataframe
- Return type
pd.DataFrame
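The "load if present, otherwise download" behaviour can be sketched with the standard library alone. This is not graphein's code: `load_or_download` is a hypothetical helper, and the download step is passed in as a callable so that no dataset URL has to be assumed.

```python
import tempfile
from pathlib import Path
from typing import Callable, List


def load_or_download(path: Path, fetch: Callable[[], bytes]) -> bytes:
    """Return the file contents, calling fetch() only when the file is missing."""
    if not path.exists():
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(fetch())
    return path.read_bytes()


fetch_calls: List[int] = []


def fake_fetch() -> bytes:
    """Stand-in for a real download; returns one made-up tab-separated row."""
    fetch_calls.append(1)
    return b"A\tB\tActivation\n"


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "trrust" / "data.tsv"
    first = load_or_download(target, fake_fetch)   # file missing: fetches
    second = load_or_download(target, fake_fetch)  # file present: reads from disk
```

Injecting the fetch step keeps the caching logic testable offline.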
- graphein.grn.parse_trrust.parse_TRRUST(gene_list: List[str], root_dir: Optional[pathlib.Path] = None) pandas.core.frame.DataFrame [source]#
Parser for TRRUST regulatory interactions. If the TRRUST dataset is not found in the specified root_dir, it is downloaded
- Parameters
gene_list (List[str]) – List of gene identifiers to restrict dataframe to.
root_dir (pathlib.Path, optional) – Root directory path to either find or download TRRUST. Defaults to None (downloads dataset to graphein/datasets/trrust)
- Returns
Pandas dataframe with the regulatory interactions between genes in the gene list
- Return type
pd.DataFrame
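Restricting an interaction table to a gene list can be sketched with a boolean mask. The helper `restrict_to_genes` is hypothetical, and the sketch assumes that "between genes in the gene list" means both the regulator (`g1`) and the target (`g2`) must be in the list.

```python
from typing import List

import pandas as pd


def restrict_to_genes(df: pd.DataFrame, gene_list: List[str]) -> pd.DataFrame:
    """Keep rows whose regulator and target are both in gene_list."""
    mask = df["g1"].isin(gene_list) & df["g2"].isin(gene_list)
    return df[mask].reset_index(drop=True)


# Toy data with placeholder gene names; not real TRRUST records.
interactions = pd.DataFrame(
    {
        "g1": ["A", "A", "D"],
        "g2": ["B", "D", "B"],
        "regtype": ["Activation", "Repression", "Unknown"],
    }
)
subset = restrict_to_genes(interactions, ["A", "B", "C"])
```

Rows touching `D` are dropped because `D` is not in the requested gene list, even though the other endpoint is.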
- graphein.grn.parse_trrust.standardise_TRRUST(df: pandas.core.frame.DataFrame) pandas.core.frame.DataFrame [source]#
Standardises the TRRUST dataframe, e.g. puts everything into a common format.
- Parameters
df (pd.DataFrame) – Source-specific TRRUST dataframe. Must contain columns: [“g1”, “g2”, “regtype”]
- Returns
Standardised dataframe
- Return type
pd.DataFrame