
enpi_api.examples.import_collections

Import clone collections

Examples of how to import new clone collections from uploaded files. Clone collections, which contain clones and their sequences, are the primary input for most of the applications and modules available in the ENPICOM Platform. Note that input files for a clone collection must follow a set of rules and contain a number of mandatory columns. See the `create_collection_from_csv` documentation (https://igx.bio/public-api/v1/public/docs/python_sdk/enpi_api/l2/client/api/collection_api.html#CollectionApi.create_collection_from_csv) for more details on this behavior.
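Because an import depends on the file having the right columns, it can help to check the header locally before uploading. The sketch below is not part of the SDK: `missing_columns` and `EXPECTED_COLUMNS` are hypothetical names, and the column list is simply copied from the example files on this page, so treat the linked documentation as the authoritative source for what is mandatory.

```python
import csv

# Assumption: these names come from the example header shown on this page.
# The authoritative list of mandatory columns is in the
# `create_collection_from_csv` documentation.
EXPECTED_COLUMNS = [
    "Name",
    "Organism",
    "Sequence Count",
    "Full Sequence Nucleotides",
    "Receptor Nucleotides",
    "CDR3 Nucleotides",
    "CDR3 Amino Acids",
    "V Call",
    "J Call",
    "Productive",
]


def missing_columns(csv_path, expected=EXPECTED_COLUMNS):
    """Return the expected column names that are absent from the CSV header."""
    with open(csv_path, newline="") as handle:
        # Read only the first row; an empty file yields an empty header.
        header = next(csv.reader(handle), [])
    present = {column.strip() for column in header}
    return [column for column in expected if column not in present]
```

Calling `missing_columns("/data/example/input_file.csv")` before the import returns an empty list when every expected column is present, which makes it easy to fail early with a readable message.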

Simple clone collection import

This example showcases the simplest possible clone collection import configuration.

from enpi_api.l2.client.enpi_api_client import EnpiApiClient

"""This example assumes that the file referenced below exists and that its content matches the following structure:
Name,Organism,Sequence Count,Full Sequence Nucleotides,Receptor Nucleotides,CDR3 Nucleotides,CDR3 Amino Acids,V Call,J Call,Productive
Example collection,Homo sapiens,1,ATCG,ATCG,ATCG,ATCG,IGHV3-19*13,IGHJ4*21,true
Example collection,Homo sapiens,1,ATCG,ATCG,ATCG,ATCG,IGHV3-19,IGHJ4,false
...

"""
import_file_path = "/data/example/input_file.csv"


with EnpiApiClient() as enpi_client:
    """The simplest version of a new collection import. It will upload the file internally
    and then run an import job on that file, which results in a new collection being created"""

    new_collection_metadata = enpi_client.collection_api.create_collection_from_csv(
        import_file_path,  # Path to the input CSV file
    ).wait()

    # Metadata of the newly uploaded collection
    print(new_collection_metadata)

Import collection with manual reference setting

This example showcases how to manually specify a reference for the imported collection and explains when it's required.

from enpi_api.l2.client.enpi_api_client import EnpiApiClient

"""We are assuming that the file referenced below exists and that its content matches the following structure:

Name,Organism,Sequence Count,Full Sequence Nucleotides,Receptor Nucleotides,CDR3 Nucleotides,CDR3 Amino Acids,V Call,J Call,Productive
Example collection,Homo sapiens,1,ATCG,ATCG,ATCG,ATCG,IGHV3-19*13,IGHJ4*21,true
Example collection,Homo sapiens,1,ATCG,ATCG,ATCG,ATCG,IGHV3-19,IGHJ4,false
...

The reference revision below must also already exist in the ENPICOM Platform as either a
public one or one specific for your organization.
"""

import_file_path = "/data/example/input_file.csv"
reference_name = "ENPICOM Platform (MiLaboratories)"
species = "Homo sapiens"


with EnpiApiClient() as enpi_client:
    """An optional `reference_database_revision` param can be passed to `create_collection_from_csv` and its `_from_df`
    variant. This is used to specify a reference for the imported collection"""
    reference = enpi_client.reference_database_api.get_revision_by_name(
        name=reference_name,
        species=species,
    )

    new_collection_metadata = enpi_client.collection_api.create_collection_from_csv(
        import_file_path,  # Path to the input CSV file
        reference_database_revision=reference,  # A reference revision for the imported collection
    ).wait()

    print(new_collection_metadata)

    """If the `reference_database_revision` parameter is not provided, the import checks the `Organism` value in
    the first line of the imported CSV file and queries the available references for that organism. If exactly one
    is available, it is picked for this collection and the import continues. However, if more than one reference is
    available for the organism, or there is none, the import fails. In that case a reference must be selected manually,
    as showcased above.

    new_collection_metadata = enpi_client.collection_api.create_collection_from_csv(
        import_file_path,  # Path to the input CSV file
        # No `reference_database_revision` specified
    ).wait()

    In practice, there is no downside to specifying the reference manually for all imported collections with the
    `reference_database_revision` param. On the contrary, it can prevent issues in the future if more references
    become available, which could cause the multiple-references-available error.
    """
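
The fallback rule described in the docstring above (exactly one matching reference is picked, none or several cause a failure) can be modelled in a few lines. This is only an illustration of the described behavior, not the platform's implementation: `pick_reference` and the `(name, species)` tuples are hypothetical stand-ins for whatever the import job uses internally.

```python
def pick_reference(organism, available_references):
    """Model the selection rule: exactly one match is picked, anything else fails.

    `available_references` is a hypothetical list of (name, species) tuples.
    """
    matches = [name for name, species in available_references if species == organism]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise ValueError(f"No reference available for {organism!r}")
    raise ValueError(f"Multiple references available for {organism!r}: {matches}")
```

The model also shows why passing `reference_database_revision` explicitly is the safer habit: an import that works today with a single reference would start raising as soon as a second reference for the same organism is added.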

Import clone collection with additional utils

This example showcases additional utilities that can be used for the collection import: metadata, column mapping, skiprows.

from enpi_api.l2.client.enpi_api_client import EnpiApiClient
from enpi_api.l2.tags import CollectionTags

"""This example assumes that the file referenced below exists and that its content matches the following structure:
CollectionTitle,Organism,Sequence Count,Full Sequence Nucleotides,Receptor Nucleotides,CDR3 Nucleotides,CDR3 Amino Acids,V Call,J Call,Productive
Example collection,Homo sapiens,1,ATCG,ATCG,ATCG,ATCG,IGHV3-19*13,IGHJ4*21,true
Example collection,Homo sapiens,1,ATCG,ATCG,ATCG,ATCG,IGHV3-19,IGHJ4,false
...

"""
import_file_path = "/data/example/input_file.csv"


mapping = {
    "CollectionTitle": CollectionTags.Name,  # `CollectionTitle` column from the file
    # will be mapped to `CollectionTags.Name`
}

metadata = {
    CollectionTags.ProjectId: "Project 001"  # Project ID, this will be appended as a Collection tag
}

# First row from the CSV file will be skipped
skiprows = 1

with EnpiApiClient() as enpi_client:
    collection_metadata = enpi_client.collection_api.create_collection_from_csv(
        import_file_path,
        skiprows=skiprows,
        mapping=mapping,
        metadata=metadata,
    ).wait()

    print(collection_metadata)
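
To make the effect of `mapping` concrete, the sketch below renames header columns of a CSV text locally. It is a conceptual illustration only: `rename_columns` is a hypothetical helper, the plain string `"Name"` stands in for `CollectionTags.Name`, and the SDK may apply the mapping differently. The `skiprows` and `metadata` options are not modelled here.

```python
import csv
import io


def rename_columns(csv_text, mapping):
    """Return the CSV text with header names replaced according to `mapping`."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return csv_text
    # Only the header row is rewritten; unmapped columns keep their names.
    rows[0] = [mapping.get(column, column) for column in rows[0]]
    output = io.StringIO()
    csv.writer(output, lineterminator="\n").writerows(rows)
    return output.getvalue()


example = "CollectionTitle,Organism\nExample collection,Homo sapiens\n"
print(rename_columns(example, {"CollectionTitle": "Name"}))
```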