enpi_api.examples.apps.enrichment

Enrichment

Enrichment helps you discover a diverse set of highly enriched candidates from display screening. Join, intersect, and subtract datasets to analyze any panning campaign design, no matter the complexity.

Example Enrichment run configuration

This example accepts four input clone collections, runs them through Cluster to obtain clustered clone data, then configures and creates an Enrichment template, runs Enrichment with it, and finally exports the results of one of the Enrichment operations into a DataFrame. The example showcases all three kinds of operations available in Enrichment, as well as the Fold Change annotation option.

from enpi_api.l2.client.enpi_api_client import EnpiApiClient
from enpi_api.l2.tags import SequenceTags
from enpi_api.l2.types.cluster import SequenceFeatureIdentities
from enpi_api.l2.types.collection import CollectionId
from enpi_api.l2.types.enrichment import (
    CollectionByIdSelector,
    EnrichmentExportMode,
    EnrichmentTemplateDifferenceInputs,
    EnrichmentTemplateDifferenceOperation,
    EnrichmentTemplateFoldChangeAnnotation,
    EnrichmentTemplateIntersectionOperation,
    EnrichmentTemplateUnionOperation,
    FoldChangeInput,
    FoldChangeInputs,
    IntersectionOperationInput,
    SetOperationSelector,
    UnionOperationInput,
)

with EnpiApiClient() as enpi_client:
    # This example script needs four collections in total. Fill the following two lists with the IDs
    # of the collections you would like to use. You can also use `enpi_client.collection_api.get_collections_metadata` to
    # see all the available collections.
    target_collections_ids = [CollectionId(123), CollectionId(124)]
    counter_collections_ids = [CollectionId(125), CollectionId(126)]

    # Before running Enrichment, the input collections need to go through clustering: all available operations
    # are performed on clusters of clones, not on the clones themselves. The result of the cluster run serves as part of
    # the Enrichment run input configuration.
    cluster_run = enpi_client.cluster_api.start(
        name="Collections clustered for Enrichment run",
        collection_ids=target_collections_ids + counter_collections_ids,  # All four collections need to be clustered
        sequence_features=[SequenceTags.Cdr3AminoAcids],
        identities=SequenceFeatureIdentities(Heavy=80),
        match_tags=[SequenceTags.Cdr3AminoAcidsLength],
    ).wait()

    # Next, an Enrichment template needs to be created before a computation can be run.
    # Templates are reusable sets of instructions on how to handle the input data and which operations
    # are performed on it. Each Enrichment run is linked to a single template, but a single template
    # can be linked to multiple Enrichment runs.
    template = enpi_client.enrichment_api.create_template(
        name="Example Enrichment template",
        # Operations are performed on the input data sets. Later, when this template is used to run an analysis,
        # each of the following operations is matched by name and filled with either collections or the results of previous operations.
        operations=[
            # An intersection operation will be run
            EnrichmentTemplateIntersectionOperation(
                name="Target",  # Node names within a single Enrichment template must be unique so that inputs can be matched to them by name later on.
            ),
            # A union operation will be run
            EnrichmentTemplateUnionOperation(
                name="Counter",
            ),
            # A difference operation will be run. Here we specify its inputs up front, namely
            # the results of the two operations defined above. This means that the "Target" and "Counter"
            # operations are performed first, and "Result" then removes the "Counter" result from the "Target" result.
            EnrichmentTemplateDifferenceOperation(
                name="Result",
                inputs=EnrichmentTemplateDifferenceInputs(
                    remove_input=SetOperationSelector(value="Counter"),
                    from_input=SetOperationSelector(value="Target"),
                ),
                # Annotations compute the fold change between two entities after an operation is performed.
                # These entities can be either input collections or the results of previous operations.
                annotations=[
                    EnrichmentTemplateFoldChangeAnnotation(name="Target FC"),
                ],
            ),
        ],
    )

    # Once the Enrichment template is created, we can use it together with the clustered collection data to run Enrichment.
    enrichment_run = enpi_client.enrichment_api.start(
        name="Example Enrichment Run",
        enrichment_template=template,  # Here we specify our Enrichment template
        cluster_run_id=cluster_run.id,  # Here we provide the clustered input data
        inputs=[
            # Now we need to populate the inputs declared in the selected Enrichment template by matching the nodes by name
            # and providing collection IDs for them.
            IntersectionOperationInput(
                name="Target",
                inputs=[CollectionByIdSelector(value=collection_id) for collection_id in target_collections_ids],
            ),
            UnionOperationInput(
                name="Counter",
                inputs=[CollectionByIdSelector(value=collection_id) for collection_id in counter_collections_ids],
            ),
            # We do not need to specify inputs for the "Result" operation because the Enrichment template already
            # names the results of the "Target" and "Counter" operations as its inputs.
            FoldChangeInput(
                name="Target FC",  # Within the Enrichment template, this Fold Change annotation was added on the "Result" operation.
                operation_name="Result",
                inputs=FoldChangeInputs(
                    # We would like to get the fold change ratio after the "Result" operation is done...
                    from_input=CollectionByIdSelector(
                        value=target_collections_ids[0],  # ... from the first target collection...
                    ),
                    to_input=CollectionByIdSelector(
                        value=target_collections_ids[1],  # ... to the second target collection.
                    ),
                ),
            ),
        ],
    ).wait()

    # After the Enrichment computation has finished successfully, we can export the results of this run.
    df = enpi_client.enrichment_api.export_as_df(
        enrichment_run_id=enrichment_run.id,  # The ID of the Enrichment analysis that ran above
        operation="Result",  # The name of the operation whose result we want to export
        mode=EnrichmentExportMode.REPRESENTATIVES,  # Mode in which the export will be run
        tag_ids=[SequenceTags.Cdr3AminoAcids, SequenceTags.Chain],  # Tags to be exported
        limit=100,  # Maximum number of clusters to be exported
    ).wait()

    print(df)
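
The set logic encoded in the template above can be illustrated with plain Python sets. This is a conceptual sketch only: it does not call `enpi_api`, the cluster IDs and clone counts are made up, and the helper names (`intersect_all`, `union_all`, `fold_change`) are hypothetical. The fold change formula used here (ratio of a cluster's relative frequency in the "to" collection over the "from" collection) is an assumption for illustration; the platform may compute it differently.

```python
from typing import List, Set


def intersect_all(cluster_sets: List[Set[int]]) -> Set[int]:
    """Clusters present in every input collection (like the "Target" node)."""
    if not cluster_sets:
        return set()
    return set.intersection(*cluster_sets)


def union_all(cluster_sets: List[Set[int]]) -> Set[int]:
    """Clusters present in at least one input collection (like the "Counter" node)."""
    return set().union(*cluster_sets)


def fold_change(from_count: int, from_total: int, to_count: int, to_total: int) -> float:
    """Assumed definition: (to_count / to_total) / (from_count / from_total)."""
    if from_count == 0 or to_total == 0:
        raise ValueError("fold change is undefined for a zero denominator")
    # Rearranged to a single division to avoid compounding rounding errors.
    return (to_count * from_total) / (from_count * to_total)


# Hypothetical cluster IDs observed in each of the four collections.
target_a = {1, 2, 3, 4}
target_b = {2, 3, 4, 5}
counter_a = {4, 6}
counter_b = {7}

target = intersect_all([target_a, target_b])  # clusters shared by both target collections
counter = union_all([counter_a, counter_b])   # clusters seen in any counter collection
result = target - counter                     # "Result": remove Counter from Target

# Hypothetical clone counts for one surviving cluster in the two target collections.
fc = fold_change(from_count=10, from_total=1000, to_count=40, to_total=1000)

print(sorted(result), fc)
```

With these made-up inputs, cluster 4 is dropped from the result because it also appears in a counter collection, which mirrors what the "Result" difference operation is meant to express.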
