immunum
immunum.Annotator
Annotates antibody and T-cell receptor sequences with IMGT or Kabat position numbers.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `chains` | `list[str]` | Chain types to consider during auto-detection. Each entry is a case-insensitive string. Pass every chain you want considered; the annotator scores each one and picks the best match. To consider every supported chain, pass all seven values. | required |
| `scheme` | `str` | Numbering scheme to use for output positions, IMGT or Kabat (case-insensitive). Note: Kabat is only supported for antibody chains (IGH, IGK, IGL). | required |
| `min_confidence` | `float \| None` | Minimum alignment confidence threshold. | `None` |
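The Kabat restriction noted above can be checked before an `Annotator` is constructed. The following is only a minimal sketch: `kabat_is_allowed` and `ANTIBODY_CHAINS` are hypothetical names that are not part of immunum, and accepting both the short labels used in this page's examples (`"H"`, `"K"`, `"L"`) and the long ones from the note (`"IGH"`, `"IGK"`, `"IGL"`) is an assumption of the sketch.

```python
# Antibody chain labels. "H", "K", "L" appear in this page's examples and
# "IGH", "IGK", "IGL" in the Kabat note. Accepting both spellings is an
# assumption of this sketch, not a statement about what Annotator accepts.
ANTIBODY_CHAINS = {"H", "K", "L", "IGH", "IGK", "IGL"}


def kabat_is_allowed(chains: list[str]) -> bool:
    """Return True if every requested chain is an antibody chain.

    Comparison is case-insensitive, matching the documented behaviour
    of the chains parameter.
    """
    return all(chain.upper() in ANTIBODY_CHAINS for chain in chains)


# "some-tcr-chain" is a placeholder for any non-antibody label.
print(kabat_is_allowed(["H", "k", "igl"]))           # True
print(kabat_is_allowed(["H", "some-tcr-chain"]))     # False
```

A check like this only mirrors the documented rule; the constructor remains the authority and raises `ValueError` for invalid combinations.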
Source code in immunum/__init__.py
_annotator
instance-attribute
__init__(chains, scheme, min_confidence=None)
Create an Annotator.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `chains` | `list[str]` | Chain types to consider. See class docstring for accepted values. | required |
| `scheme` | `str` | Numbering scheme. | required |
| `min_confidence` | `float \| None` | Reject sequences with alignment confidence below this threshold. Defaults to `None`. | `None` |
Raises:
| Type | Description |
|---|---|
| `ValueError` | If any chain or scheme value is unrecognised, if Kabat is requested for TCR chains, or if … |
Source code in immunum/__init__.py
number(sequence)
Assign IMGT or Kabat position numbers to every residue in a sequence.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `sequence` | `str` | Amino-acid sequence string (single-letter codes). | required |
Returns:
| Type | Description |
|---|---|
| `NumberingResult` | A … and a … set and all other fields are … |
Source code in immunum/__init__.py
segment(sequence)
Split a sequence into FR/CDR regions.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `sequence` | `str` | Amino-acid sequence string (single-letter codes). | required |
Returns:
| Type | Description |
|---|---|
| `SegmenationResult` | A … and any unaligned … |
Source code in immunum/__init__.py
immunum.NumberingResult
dataclass
Python dataclass containing numbering results. Allows for direct attribute access
via result.chain, result.numbering, etc.:

    from immunum import Annotator

    annotator = Annotator(chains=["H", "K", "L"], scheme="imgt")
    sequence = "QVQLVQSGAEVKRPGSSVTVSCKASGGSFSTYALSWVRQAPGRGLEWMGGVIPLLTITNYAPRFQGRITITADRSTSTAYLELNSLRPEDTAVYYCAREGTTGKPIGAFAHWGQGTLVTVSS"
    result = annotator.number(sequence)

    assert result.chain == "H"
    assert result.scheme == "IMGT"
    assert isinstance(result.confidence, float)
    assert result.numbering["1"] == "Q"

    for position, amino_acid in result.numbering.items():
        print(f"{position}: {amino_acid}")
    # 1: Q
    # 2: V
    # 3: Q
    # ...

Source code in immunum/__init__.py
chain
instance-attribute
scheme
instance-attribute
confidence
instance-attribute
numbering
instance-attribute
query_start
instance-attribute
query_end
instance-attribute
error
instance-attribute
__init__(chain, scheme, confidence, numbering, query_start, query_end, error)
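The `error` field suggests that a failed annotation can be reported on the result object itself, so callers may want to check it before reading `numbering`. The sketch below illustrates that pattern under stated assumptions: `StandInResult` is a hypothetical stand-in that copies only the field names listed above, the field types and sample values are guesses, and `require_numbering` is not part of immunum.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StandInResult:
    """Hypothetical stand-in with the same field names as NumberingResult.

    The field types are assumptions; this page does not state them.
    """

    chain: Optional[str]
    scheme: Optional[str]
    confidence: Optional[float]
    numbering: Optional[dict]
    query_start: Optional[int]
    query_end: Optional[int]
    error: Optional[str]


def require_numbering(result) -> dict:
    """Return result.numbering, or raise if the result carries an error."""
    if result.error is not None:
        raise RuntimeError(f"annotation failed: {result.error}")
    return result.numbering


# Illustrative values only, not output produced by immunum.
ok = StandInResult("H", "IMGT", 0.9, {"1": "Q", "2": "V"}, 0, 1, None)
failed = StandInResult(None, None, None, None, None, None, "low confidence")

print(require_numbering(ok)["1"])  # Q
try:
    require_numbering(failed)
except RuntimeError as exc:
    print(exc)  # annotation failed: low confidence
```

Because the helper only reads the `error` and `numbering` attributes, it should work unchanged on a real result object if those attributes behave as assumed here.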
immunum.SegmenationResult
dataclass
Python dataclass containing segmentation results. Allows for direct attribute access
via result.fr1, and also for iterating through segmentation results via as_dict():

    from immunum import Annotator

    annotator = Annotator(chains=["H", "K", "L"], scheme="imgt")
    sequence = "QVQLVQSGAEVKRPGSSVTVSCKASGGSFSTYALSWVRQAPGRGLEWMGGVIPLLTITNYAPRFQGRITITADRSTSTAYLELNSLRPEDTAVYYCAREGTTGKPIGAFAHWGQGTLVTVSS"
    result = annotator.segment(sequence)

    assert result.fr1 == "QVQLVQSGAEVKRPGSSVTVSCKAS"
    assert result.cdr1 == "GGSFSTYA"
    assert result.fr2 == "LSWVRQAPGRGLEWMGG"
    assert result.cdr2 == "VIPLLTIT"
    assert result.fr3 == "NYAPRFQGRITITADRSTSTAYLELNSLRPEDTAVYYC"
    assert result.cdr3 == "AREGTTGKPIGAFAH"
    assert result.fr4 == "WGQGTLVTVSS"

    for segment, aminoacids in result.as_dict().items():
        print(f"{segment}: {aminoacids}")
    # fr1: QVQLVQSGAEVKRPGSSVTVSCKAS
    # cdr1: GGSFSTYA
    # fr2: LSWVRQAPGRGLEWMGG
    # cdr2: VIPLLTIT
    # fr3: NYAPRFQGRITITADRSTSTAYLELNSLRPEDTAVYYC
    # cdr3: AREGTTGKPIGAFAH
    # fr4: WGQGTLVTVSS
    # prefix:
    # postfix:

Source code in immunum/__init__.py
fr1
instance-attribute
cdr1
instance-attribute
fr2
instance-attribute
cdr2
instance-attribute
fr3
instance-attribute
cdr3
instance-attribute
fr4
instance-attribute
prefix
instance-attribute
postfix
instance-attribute
error
instance-attribute
__init__(fr1, cdr1, fr2, cdr2, fr3, cdr3, fr4, prefix, postfix, error)
as_dict()
Return a dict mapping segment names to sequences (excludes the error field).
Returns:
| Type | Description |
|---|---|
| `dict[str, Optional[str]]` | Dict mapping segment names (`['fr1', 'fr2', ...]`) to their amino-acid sequences. |
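Since the mapping's values are typed as optional, code that consumes it should tolerate `None`. As a small illustration, the sketch below computes CDR lengths from such a mapping. `cdr_lengths` is a hypothetical helper, not part of immunum, and the input is a plain dict copied from the segmentation example on this page rather than a live `as_dict()` call; the empty strings for `prefix` and `postfix` are inferred from that example's printed output.

```python
from typing import Optional


def cdr_lengths(segments: dict[str, Optional[str]]) -> dict[str, int]:
    """Length of each CDR; a missing or None segment counts as 0."""
    return {
        name: len(segments.get(name) or "")
        for name in ("cdr1", "cdr2", "cdr3")
    }


# Values copied from the segmentation example on this page. In real use
# this dict would come from result.as_dict().
segments = {
    "fr1": "QVQLVQSGAEVKRPGSSVTVSCKAS",
    "cdr1": "GGSFSTYA",
    "fr2": "LSWVRQAPGRGLEWMGG",
    "cdr2": "VIPLLTIT",
    "fr3": "NYAPRFQGRITITADRSTSTAYLELNSLRPEDTAVYYC",
    "cdr3": "AREGTTGKPIGAFAH",
    "fr4": "WGQGTLVTVSS",
    "prefix": "",
    "postfix": "",
}

print(cdr_lengths(segments))  # {'cdr1': 8, 'cdr2': 8, 'cdr3': 15}
```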