enpi_api.examples.import_collections
Import clone collections
Examples of how to create new clone collections from uploaded files. Clone collections, which contain clones and their sequences, are the primary input for most of the applications and modules available in the ENPICOM Platform. Input files for a clone collection must follow a set of rules and must contain a number of mandatory columns. See the documentation for `CollectionApi.create_collection_from_csv` (https://igx.bio/public-api/v1/public/docs/python_sdk/enpi_api/l2/client/api/collection_api.html#CollectionApi.create_collection_from_csv) for more details on these requirements.
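Because an import depends on the header of the input file, it can help to check the header locally before uploading. The following is a minimal sketch using only the Python standard library. The `EXPECTED_COLUMNS` list is an assumption copied from the example header used on this page, and `missing_columns` is a hypothetical helper, not part of the SDK; the authoritative list of mandatory columns is in the linked documentation.

```python
import csv
import io

# Assumption: column names copied from the example header on this page.
# The authoritative set of mandatory columns is defined by the platform documentation.
EXPECTED_COLUMNS = [
    "Name",
    "Organism",
    "Sequence Count",
    "Full Sequence Nucleotides",
    "Receptor Nucleotides",
    "CDR3 Nucleotides",
    "CDR3 Amino Acids",
    "V Call",
    "J Call",
    "Productive",
]


def missing_columns(csv_text, expected=EXPECTED_COLUMNS):
    """Return the expected columns that are absent from the CSV header row.

    Hypothetical helper for a local pre-check; it is not part of the SDK.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])  # An empty input yields an empty header
    present = {name.strip() for name in header}
    return [name for name in expected if name not in present]


sample = (
    "Name,Organism,Sequence Count,Full Sequence Nucleotides,Receptor Nucleotides,"
    "CDR3 Nucleotides,CDR3 Amino Acids,V Call,J Call,Productive\n"
    "Example collection,Homo sapiens,1,ATCG,ATCG,ATCG,ATCG,IGHV3-19*13,IGHJ4*21,true\n"
)

# An empty list means every expected column was found in the header
print(missing_columns(sample))
```

A non-empty result lists the columns to add or to map before running the import.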
Simple clone collection import
This example showcases the simplest possible clone collection import configuration.
from enpi_api.l2.client.enpi_api_client import EnpiApiClient
"""This example assumes that the file referenced below exists and that its content matches the following structure:
Name,Organism,Sequence Count,Full Sequence Nucleotides,Receptor Nucleotides,CDR3 Nucleotides,CDR3 Amino Acids,V Call,J Call,Productive
Example collection,Homo sapiens,1,ATCG,ATCG,ATCG,ATCG,IGHV3-19*13,IGHJ4*21,true
Example collection,Homo sapiens,1,ATCG,ATCG,ATCG,ATCG,IGHV3-19,IGHJ4,false
...
"""
import_file_path = "/data/example/input_file.csv"
with EnpiApiClient() as enpi_client:
    """The simplest version of a new collection import. It will upload the file internally
    and then run an import job on that file, which results in a new collection being created"""
    new_collection_metadata = enpi_client.collection_api.create_collection_from_csv(
        import_file_path,  # Path to the input CSV file
    ).wait()
    # Metadata of the newly uploaded collection
    print(new_collection_metadata)
Import collection with manual reference setting
This example showcases how to manually specify a reference for the imported collection and explains when it's required.
from enpi_api.l2.client.enpi_api_client import EnpiApiClient
"""We are assuming that the file referenced below exists and that its content matches the following structure:
Name,Organism,Sequence Count,Full Sequence Nucleotides,Receptor Nucleotides,CDR3 Nucleotides,CDR3 Amino Acids,V Call,J Call,Productive
Example collection,Homo sapiens,1,ATCG,ATCG,ATCG,ATCG,IGHV3-19*13,IGHJ4*21,true
Example collection,Homo sapiens,1,ATCG,ATCG,ATCG,ATCG,IGHV3-19,IGHJ4,false
...
The reference revision below must also already exist in the ENPICOM Platform as either a
public one or one specific for your organization.
"""
import_file_path = "/data/example/input_file.csv"
reference_name = "ENPICOM Platform (MiLaboratories)"
species = "Homo sapiens"
with EnpiApiClient() as enpi_client:
    """An optional `reference_database_revision` param can be passed to `create_collection_from_csv` and its `_from_df`
    variant. This is used to specify a reference for the imported collection"""
    reference = enpi_client.reference_database_api.get_revision_by_name(
        name=reference_name,
        species=species,
    )
    new_collection_metadata = enpi_client.collection_api.create_collection_from_csv(
        import_file_path,  # Path to the input CSV file
        reference_database_revision=reference,  # A reference revision for the imported collection
    ).wait()
    print(new_collection_metadata)
    """If the `reference_database_revision` parameter is not provided, the import checks the `Organism` value in
    the first line of the imported CSV file and queries the available references for that organism. If exactly one
    reference is available, it is picked for this collection and the import continues. However, if more than one
    reference is available for the organism, or none at all, the import fails. In that case the reference has to be
    selected manually, as showcased above.

        new_collection_metadata = enpi_client.collection_api.create_collection_from_csv(
            import_file_path,  # Path to the input CSV file
            # No `reference_database_revision` specified
        ).wait()

    In practice, there is no downside to manually specifying the reference for all imported collections with the
    `reference_database_revision` param. On the contrary, it can prevent issues in the future if more references become
    available, which could cause the multiple-references-available error.
    """
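The fallback rule described above can be sketched in plain Python. This illustrates the described behavior only and is not the SDK's or the platform's implementation: `pick_reference` is a hypothetical function, the list of `(name, species)` pairs is an assumed stand-in for whatever the platform queries, and the second reference name in the usage is made up.

```python
def pick_reference(organism, available_references):
    """Mimic the described fallback: exactly one match is used, anything else fails.

    `available_references` is a hypothetical list of (name, species) pairs.
    """
    matches = [name for name, species in available_references if species == organism]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise LookupError(f"No reference available for {organism!r}")
    raise LookupError(f"Multiple references available for {organism!r}: {matches}")


# One reference for the organism: it is selected automatically
single = [("ENPICOM Platform (MiLaboratories)", "Homo sapiens")]
print(pick_reference("Homo sapiens", single))

# Two references for the same organism: automatic selection is ambiguous and fails
# ("Custom reference" is a made-up name for illustration)
ambiguous = single + [("Custom reference", "Homo sapiens")]
try:
    pick_reference("Homo sapiens", ambiguous)
except LookupError as error:
    print(error)
```

The second case shows why passing `reference_database_revision` explicitly is more robust: an import that works today can start failing once another reference for the same organism becomes available.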
Import clone collection with additional utils
This example showcases additional utilities that can be used for the collection import: metadata, column mapping, skiprows.
from enpi_api.l2.client.enpi_api_client import EnpiApiClient
from enpi_api.l2.tags import CollectionTags
"""This example assumes that the file referenced below exists and that its content matches the following structure:
CollectionTitle,Organism,Sequence Count,Full Sequence Nucleotides,Receptor Nucleotides,CDR3 Nucleotides,CDR3 Amino Acids,V Call,J Call,Productive
Example collection,Homo sapiens,1,ATCG,ATCG,ATCG,ATCG,IGHV3-19*13,IGHJ4*21,true
Example collection,Homo sapiens,1,ATCG,ATCG,ATCG,ATCG,IGHV3-19,IGHJ4,false
...
"""
import_file_path = "/data/example/input_file.csv"
mapping = {
    "CollectionTitle": CollectionTags.Name,  # `CollectionTitle` column from the file
    # will be mapped to `CollectionTags.Name`
}
metadata = {
    CollectionTags.ProjectId: "Project 001"  # Project ID, this will be appended as a Collection tag
}
# First row from the CSV file will be skipped
skiprows = 1
with EnpiApiClient() as enpi_client:
    collection_metadata = enpi_client.collection_api.create_collection_from_csv(
        import_file_path,
        skiprows=skiprows,
        mapping=mapping,
        metadata=metadata,
    ).wait()
    print(collection_metadata)
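To make the effect of a column mapping more concrete, the following standard-library sketch renames header columns locally. It is an illustration under assumptions, not how the SDK applies `mapping`: `rename_header` is a hypothetical helper, and the plain string `"Name"` stands in for `CollectionTags.Name` on the assumption that this tag corresponds to the `Name` column used in the earlier examples. The `skiprows` and `metadata` options are not modelled here.

```python
import csv
import io


def rename_header(csv_text, mapping):
    """Return the CSV text with header columns renamed according to `mapping`.

    Hypothetical helper for illustration; columns missing from `mapping` are kept as-is.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return csv_text
    rows[0] = [mapping.get(column, column) for column in rows[0]]
    output = io.StringIO()
    csv.writer(output, lineterminator="\n").writerows(rows)
    return output.getvalue()


source = (
    "CollectionTitle,Organism,Sequence Count\n"
    "Example collection,Homo sapiens,1\n"
)

# Assumption: "Name" is used as a stand-in for `CollectionTags.Name`
print(rename_header(source, {"CollectionTitle": "Name"}))
```

After the rename, the header matches the structure that the first two examples assume, which is what the mapping is meant to achieve without editing the file itself.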