immunum
immunum.Annotator
Annotates antibody and T-cell receptor sequences with IMGT or Kabat position numbers.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| chains | list[str] | Chain types to consider during auto-detection. Each entry is a case-insensitive string. Accepted values: Pass all chains you want to consider; the annotator scores each and picks the best-matching one. To consider every supported chain pass all seven values. | required |
| scheme | str | Numbering scheme to use for output positions. Accepted values (case-insensitive): Note: Kabat is only supported for antibody chains (IGH, IGK, IGL). | required |
| min_confidence | float \| None | Minimum alignment confidence threshold in the range | None |
Source code in immunum/__init__.py
_annotator
instance-attribute
__init__(chains, scheme, min_confidence=None)
Create an Annotator.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| chains | list[str] | Chain types to consider. See class docstring for accepted values. | required |
| scheme | str | Numbering scheme: | required |
| min_confidence | float \| None | Reject sequences with alignment confidence below this threshold. Defaults to | None |
Raises:
| Type | Description |
|---|---|
| ValueError | If any chain or scheme value is unrecognised, if Kabat is requested for TCR chains, or if |
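Since the constructor signals bad arguments with ValueError, configuration that comes from user input can be checked once, up front. The following is a minimal sketch of that pattern, not part of immunum: `try_build` and its `factory` parameter are hypothetical names, and `fake_annotator` is a stand-in that only mimics the scheme check so the snippet runs without the library installed. In real use, `factory` would be `immunum.Annotator`.

```python
from typing import Any, Callable, Optional, Tuple


def try_build(
    factory: Callable[..., Any],
    chains: list,
    scheme: str,
    min_confidence: Optional[float] = None,
) -> Tuple[Optional[Any], Optional[str]]:
    """Return (annotator, None) on success, or (None, message) on ValueError."""
    try:
        annotator = factory(
            chains=chains, scheme=scheme, min_confidence=min_confidence
        )
    except ValueError as exc:
        return None, str(exc)
    return annotator, None


def fake_annotator(chains, scheme, min_confidence=None):
    """Stand-in for immunum.Annotator (assumption: mimics only the scheme check)."""
    if scheme.lower() not in {"imgt", "kabat"}:
        raise ValueError(f"unrecognised scheme: {scheme!r}")
    return {"chains": chains, "scheme": scheme, "min_confidence": min_confidence}


ok, err = try_build(fake_annotator, ["H", "K", "L"], "imgt")
bad, bad_err = try_build(fake_annotator, ["H"], "chothia")
```

Returning the error message instead of re-raising keeps the calling code free of try/except blocks; whether that is preferable depends on how the caller reports configuration problems.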
number(sequence)
Assign IMGT or Kabat position numbers to every residue in a sequence.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| sequence | str | Amino-acid sequence string (single-letter codes). | required |
Returns:
| Type | Description |
|---|---|
| NumberingResult | A |
| NumberingResult | and a |
Raises:
| Type | Description |
|---|---|
| ValueError | If the sequence is empty or scores below |
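Because number() raises ValueError for empty or low-confidence input, a batch run usually needs to separate successes from failures instead of stopping at the first bad sequence. The sketch below shows one way to do that. `number_all` is a hypothetical helper, not part of immunum; it works with any object exposing `number(sequence)`. `StubAnnotator` is a stand-in that mimics only the empty-sequence error and returns a plain dict rather than a NumberingResult, so the snippet runs without the library.

```python
from typing import Any, Dict, List, Tuple


def number_all(
    annotator: Any, sequences: List[str]
) -> Tuple[Dict[str, Any], Dict[str, str]]:
    """Number each sequence, collecting results and error messages separately."""
    results: Dict[str, Any] = {}
    failures: Dict[str, str] = {}
    for sequence in sequences:
        try:
            results[sequence] = annotator.number(sequence)
        except ValueError as exc:
            # Documented failure mode: empty or low-confidence sequence.
            failures[sequence] = str(exc)
    return results, failures


class StubAnnotator:
    """Stand-in for immunum.Annotator (assumption: sequential positions only)."""

    def number(self, sequence: str) -> Dict[str, str]:
        if not sequence:
            raise ValueError("empty sequence")
        return {str(i): aa for i, aa in enumerate(sequence, start=1)}


results, failures = number_all(StubAnnotator(), ["QVQ", ""])
```

With a real Annotator in place of the stub, `failures` would also hold the sequences rejected for scoring below the confidence threshold.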
segment(sequence)
Split a sequence into FR/CDR regions.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| sequence | str | Amino-acid sequence string (single-letter codes). | required |
Returns:
| Type | Description |
|---|---|
| SegmenationResult | A |
| SegmenationResult | and any unaligned |
Raises:
| Type | Description |
|---|---|
| ValueError | If the sequence is empty or scores below |
immunum.NumberingResult
dataclass
Python dataclass containing numbering results. Allows for direct attribute access
via result.chain, result.numbering, etc.:
from immunum import Annotator
annotator = Annotator(
chains=["H", "K", "L"],
scheme="imgt",
)
sequence = "QVQLVQSGAEVKRPGSSVTVSCKASGGSFSTYALSWVRQAPGRGLEWMGGVIPLLTITNYAPRFQGRITITADRSTSTAYLELNSLRPEDTAVYYCAREGTTGKPIGAFAHWGQGTLVTVSS"
result = annotator.number(sequence)
assert result.chain == "H"
assert result.scheme == "IMGT"
assert isinstance(result.confidence, float)
assert result.numbering["1"] == "Q"
for position, amino_acid in result.numbering.items():
    print(f"{position}: {amino_acid}")
# 1: Q
# 2: V
# 3: Q
# ...
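The numbering mapping uses string position labels such as "1", so selecting a numeric range means parsing those labels first. The sketch below is a hypothetical helper (`residues_in_range` is not part of immunum). It assumes each label starts with an integer, optionally followed by an insertion letter such as "111A"; that is a common convention for IMGT and Kabat insertions, but the exact label format immunum emits is not confirmed by this page. The sample dict reuses the first three entries from the example output above.

```python
import re
from typing import Dict


def residues_in_range(numbering: Dict[str, str], start: int, end: int) -> str:
    """Concatenate residues whose position number lies in [start, end].

    Relies on dicts preserving insertion order (Python 3.7+), so residues
    come back in the order the mapping lists them.
    """
    selected = []
    for label, amino_acid in numbering.items():
        match = re.match(r"\d+", label)
        if match is None:
            continue  # skip labels without a leading integer
        if start <= int(match.group()) <= end:
            selected.append(amino_acid)
    return "".join(selected)


numbering = {"1": "Q", "2": "V", "3": "Q"}
first_two = residues_in_range(numbering, 1, 2)
```

In real use the first argument would be `result.numbering` from a NumberingResult.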
chain
instance-attribute
scheme
instance-attribute
confidence
instance-attribute
numbering
instance-attribute
__init__(chain, scheme, confidence, numbering)
immunum.SegmenationResult
dataclass
Python dataclass containing segmentation results. Allows for direct attribute access
via result.fr1, and also for iterating through segmentation results via as_dict():
from immunum import Annotator
annotator = Annotator(
chains=["H", "K", "L"],
scheme="imgt",
)
sequence = "QVQLVQSGAEVKRPGSSVTVSCKASGGSFSTYALSWVRQAPGRGLEWMGGVIPLLTITNYAPRFQGRITITADRSTSTAYLELNSLRPEDTAVYYCAREGTTGKPIGAFAHWGQGTLVTVSS"
result = annotator.segment(sequence)
assert result.fr1 == "QVQLVQSGAEVKRPGSSVTVSCKAS"
assert result.cdr1 == "GGSFSTYA"
assert result.fr2 == "LSWVRQAPGRGLEWMGG"
assert result.cdr2 == "VIPLLTIT"
assert result.fr3 == "NYAPRFQGRITITADRSTSTAYLELNSLRPEDTAVYYC"
assert result.cdr3 == "AREGTTGKPIGAFAH"
assert result.fr4 == "WGQGTLVTVSS"
for segment, aminoacids in result.as_dict().items():
    print(f"{segment}: {aminoacids}")
# fr1: QVQLVQSGAEVKRPGSSVTVSCKAS
# cdr1: GGSFSTYA
# fr2: LSWVRQAPGRGLEWMGG
# cdr2: VIPLLTIT
# fr3: NYAPRFQGRITITADRSTSTAYLELNSLRPEDTAVYYC
# cdr3: AREGTTGKPIGAFAH
# fr4: WGQGTLVTVSS
# prefix:
# postfix:
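A common follow-up to segmentation is summarising CDR lengths. The sketch below does this on a plain dict shaped like the as_dict() output printed above, so it runs without immunum installed. `cdr_lengths` is a hypothetical helper, and selecting segments by the "cdr" name prefix is an assumption based on the keys shown in that output.

```python
from typing import Dict

# Values copied from the example output above.
segments = {
    "fr1": "QVQLVQSGAEVKRPGSSVTVSCKAS",
    "cdr1": "GGSFSTYA",
    "fr2": "LSWVRQAPGRGLEWMGG",
    "cdr2": "VIPLLTIT",
    "fr3": "NYAPRFQGRITITADRSTSTAYLELNSLRPEDTAVYYC",
    "cdr3": "AREGTTGKPIGAFAH",
    "fr4": "WGQGTLVTVSS",
    "prefix": "",
    "postfix": "",
}


def cdr_lengths(segments: Dict[str, str]) -> Dict[str, int]:
    """Map each CDR segment name to the length of its sequence."""
    return {
        name: len(sequence)
        for name, sequence in segments.items()
        if name.startswith("cdr")
    }


lengths = cdr_lengths(segments)
```

In real use the argument would be `result.as_dict()` from a SegmenationResult.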
fr1
instance-attribute
cdr1
instance-attribute
fr2
instance-attribute
cdr2
instance-attribute
fr3
instance-attribute
cdr3
instance-attribute
fr4
instance-attribute
prefix
instance-attribute
postfix
instance-attribute
__init__(fr1, cdr1, fr2, cdr2, fr3, cdr3, fr4, prefix, postfix)
as_dict()
Return a dict mapping segment names to sequences.
Returns:
| Type | Description |
|---|---|
| dict[str, str] | Dict mapping segment names ['fr1', 'fr2', ...] to their amino-acid sequences. |
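One use of the returned dict is a round-trip check that the segments concatenate back to the input sequence. The sketch below assumes that prefix holds residues before fr1 and postfix holds residues after fr4; this page does not state that ordering, so treat it as an assumption. `reassemble` and `SEGMENT_ORDER` are hypothetical names, not part of immunum, and the sample dict is illustrative rather than real output.

```python
from typing import Dict

# Assumed linear order of segments along the sequence.
SEGMENT_ORDER = (
    "prefix", "fr1", "cdr1", "fr2", "cdr2", "fr3", "cdr3", "fr4", "postfix",
)


def reassemble(segments: Dict[str, str]) -> str:
    """Join segments in SEGMENT_ORDER; missing keys count as empty strings."""
    return "".join(segments.get(name, "") for name in SEGMENT_ORDER)


# Illustrative toy segments, not output from a real sequence.
toy = {
    "prefix": "M", "fr1": "QV", "cdr1": "GG", "fr2": "LS", "cdr2": "VI",
    "fr3": "NY", "cdr3": "AR", "fr4": "WG", "postfix": "K",
}
full = reassemble(toy)
```

In real use, comparing `reassemble(result.as_dict())` with the original sequence is a quick way to test whether the ordering assumption holds for a given input.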