enpi_api.examples.apps.enrichment
Enrichment
Enrichment helps you discover a diverse set of highly enriched candidates from display screening. Join, intersect, and subtract datasets to analyze any panning campaign design, no matter the complexity.
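As a rough mental model of what joining, intersecting, and subtracting datasets means, the sketch below treats each dataset as a plain Python set of cluster identifiers. It is an illustration only, not how Enrichment is implemented; the cluster IDs and the helper name `combine` are hypothetical.

```python
# Illustration only: each dataset is modelled as a set of hypothetical cluster IDs.
def combine(targets, counters):
    """Keep clusters present in every target set, then drop any seen in a counter set."""
    target = set.intersection(*targets)  # intersect: clusters shared by all targets
    counter = set.union(*counters)       # join: clusters seen in any counter set
    return target - counter              # subtract: remove counter clusters


targets = [{"c1", "c2", "c3"}, {"c2", "c3", "c4"}]
counters = [{"c3"}, {"c5"}]
result = combine(targets, counters)
print(sorted(result))  # ['c2']
```

Only `c2` survives: it is present in both target sets and in neither counter set.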
Example Enrichment run configuration
This example takes four input clone collections, runs them through Cluster to obtain clustered clone data, then configures and creates an Enrichment template, runs Enrichment with it, and finally exports the results of one Enrichment operation into a DataFrame. The example showcases all three kinds of operations available in Enrichment, as well as the Fold Change annotation option.
```python
from enpi_api.l2.client.enpi_api_client import EnpiApiClient
from enpi_api.l2.tags import SequenceTags
from enpi_api.l2.types.cluster import SequenceFeatureIdentities
from enpi_api.l2.types.collection import CollectionId
from enpi_api.l2.types.enrichment import (
    CollectionByIdSelector,
    EnrichmentExportMode,
    EnrichmentTemplateDifferenceInputs,
    EnrichmentTemplateDifferenceOperation,
    EnrichmentTemplateFoldChangeAnnotation,
    EnrichmentTemplateIntersectionOperation,
    EnrichmentTemplateUnionOperation,
    FoldChangeInput,
    FoldChangeInputs,
    IntersectionOperationInput,
    SetOperationSelector,
    UnionOperationInput,
)

with EnpiApiClient() as enpi_client:
    # This example script needs four collections in total. Fill in the following two lists with the IDs
    # of the collections you would like to use. You can also use `enpi_client.collection_api.get_collections_metadata`
    # to see all the available collections.
    target_collections_ids = [CollectionId(123), CollectionId(124)]
    counter_collections_ids = [CollectionId(125), CollectionId(126)]

    # Before running Enrichment, the input collections need to go through Clustering: all available operations
    # are performed on clusters of clones, not on the clones themselves. The result of the cluster run serves
    # as part of the Enrichment run input configuration.
    cluster_run = enpi_client.cluster_api.start(
        name="Collections clustered for Enrichment run",
        collection_ids=target_collections_ids + counter_collections_ids,  # All four collections need to be clustered
        sequence_features=[SequenceTags.Cdr3AminoAcids],
        identities=SequenceFeatureIdentities(Heavy=80),
        match_tags=[SequenceTags.Cdr3AminoAcidsLength],
    ).wait()

    # Next, an Enrichment template needs to be created before a computation can be run.
    # Templates are reusable sets of instructions on how to handle the input data and what operations
    # are performed on it. Each Enrichment run is linked to a single template, but a single template
    # can be linked to multiple Enrichment runs.
    template = enpi_client.enrichment_api.create_template(
        name="Example Enrichment template",
        # Operations are performed on the input data sets. Later, when this template is used to run an analysis,
        # each of the following operations is matched by name and filled with either collections
        # or the results of previous operations.
        operations=[
            # An intersection operation will be run.
            # Node names within a single Enrichment template must be unique so that inputs can be matched to them later.
            EnrichmentTemplateIntersectionOperation(
                name="Target",
            ),
            # A union operation will be run.
            EnrichmentTemplateUnionOperation(
                name="Counter",
            ),
            # A difference operation will be run. Its inputs are specified here already: they are
            # the results of the two operations above. This means that the "Counter" and "Target"
            # operations are performed first, and "Result" is then run on their results.
            EnrichmentTemplateDifferenceOperation(
                name="Result",
                inputs=EnrichmentTemplateDifferenceInputs(
                    remove_input=SetOperationSelector(value="Counter"),
                    from_input=SetOperationSelector(value="Target"),
                ),
                # Annotations are a way to compute the fold change measurement between two entities
                # after an operation is performed. Those entities can be either input collections
                # or results of previous operations.
                annotations=[
                    EnrichmentTemplateFoldChangeAnnotation(name="Target FC"),
                ],
            ),
        ],
    )

    # Once the Enrichment template is created, we can use it together with the clustered collection data to run Enrichment.
    enrichment_run = enpi_client.enrichment_api.start(
        name="Example Enrichment Run",
        enrichment_template=template,  # Here we specify our Enrichment template
        cluster_run_id=cluster_run.id,  # Here we provide the clustered input data
        inputs=[
            # Now we need to populate the inputs specified in the selected Enrichment template
            # by matching the nodes by name and providing collection IDs for them.
            IntersectionOperationInput(
                name="Target",
                inputs=[CollectionByIdSelector(value=collection_id) for collection_id in target_collections_ids],
            ),
            UnionOperationInput(
                name="Counter",
                inputs=[CollectionByIdSelector(value=collection_id) for collection_id in counter_collections_ids],
            ),
            # We do not need to specify inputs for the "Result" operation because the template already
            # names the results of the "Target" and "Counter" operations as its inputs.
            FoldChangeInput(
                name="Target FC",  # Within the Enrichment template, this Fold Change annotation was added on the "Result" operation.
                operation_name="Result",
                inputs=FoldChangeInputs(
                    # We would like to get the fold change ratio after the "Result" operation is done...
                    from_input=CollectionByIdSelector(
                        value=target_collections_ids[0],  # ... from the first target collection...
                    ),
                    to_input=CollectionByIdSelector(
                        value=target_collections_ids[1],  # ... to the second target collection.
                    ),
                ),
            ),
        ],
    ).wait()

    # After the Enrichment computation has finished successfully, we can export the results of this run.
    df = enpi_client.enrichment_api.export_as_df(
        enrichment_run_id=enrichment_run.id,  # The ID of the Enrichment analysis that ran above
        operation="Result",  # The operation whose result we want to export
        mode=EnrichmentExportMode.REPRESENTATIVES,  # Mode in which the export will be run
        tag_ids=[SequenceTags.Cdr3AminoAcids, SequenceTags.Chain],  # Tags to be exported
        limit=100,  # Maximum number of clusters to be exported
    ).wait()

    print(df)
```
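The "Target FC" annotation in the example reports a fold change from the first target collection to the second. The exact formula is not given here, so the sketch below assumes one common definition: the ratio of a cluster's relative abundance in the "to" collection to its relative abundance in the "from" collection. The function name, the clone counts, and the handling of a zero "from" count are all hypothetical and are not part of `enpi_api`.

```python
# Assumption: fold change = (to_count / to_total) / (from_count / from_total).
# This is an illustrative definition, not necessarily the one Enrichment uses.
def fold_change(from_count, from_total, to_count, to_total):
    """Relative abundance in the 'to' collection divided by that in the 'from' collection."""
    if from_count == 0:
        return None  # undefined when the cluster is absent from the 'from' collection
    # Cross-multiplied form of the ratio of the two frequencies.
    return (to_count * from_total) / (from_count * to_total)


# Hypothetical cluster: 5 of 1000 clones before, 40 of 2000 clones after.
fc = fold_change(from_count=5, from_total=1000, to_count=40, to_total=2000)
print(fc)  # 4.0
```

Under this assumed definition, a value above 1 means the cluster makes up a larger share of the second collection than of the first.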