Chapter 9 HLA Typing
Tumor mutations can produce altered proteins that may act as antigens and elicit an immune response. Algorithms such as NetMHCpan, MHCflurry, and SMMalign (https://pubmed.ncbi.nlm.nih.gov/17608956/) are commonly used to predict neoantigen peptides. Because peptide presentation depends on the patient's HLA alleles, knowing the HLA type is necessary to identify potential neoantigens for targeted immunotherapy. Current alignment-based HLA typing methods take DNA or RNA sequencing data as input and predict either HLA class I alleles only or both class I and class II alleles. For example, arcasHLA, HLAProfiler, and seq2HLA perform high-resolution HLA typing from RNA-seq data, while OptiType and PHLAT identify HLA types from RNA, whole-exome, and whole-genome sequencing data. HLA reference sequences can be obtained from the ImMunoGeneTics (IMGT) database.
9.1 Identify HLA type
HLA typing is part of RIMA's neoantigen prediction module.
RIMA uses arcasHLA to predict both MHC class I and class II HLA types from bulk RNA-seq data. The sorted BAM alignment files generated by STAR serve as input to arcasHLA. Here we use one sample from the Zhao trial as an example. We first extract the HLA reads from the alignment file:
### Extract fastq reads
arcasHLA extract analysis/STAR/SRR8281218/SRR8281218.sorted.bam -t 16 -v --sample SRR8281218 -o analysis/neoantigen/SRR8281218
### Output from extraction
analysis/neoantigen/SRR8281218/SRR8281218.extracted.1.fq.gz
analysis/neoantigen/SRR8281218/SRR8281218.extracted.2.fq.gz
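When the extraction has to be repeated for many samples, the command can be assembled programmatically. The following is a minimal sketch and not part of RIMA: the helper name `build_extract_command` and its arguments are hypothetical, the directory layout is assumed to match the paths used in this chapter, and only the flags shown in the command above are used.

```python
from pathlib import PurePosixPath


def build_extract_command(sample, star_dir="analysis/STAR",
                          out_dir="analysis/neoantigen", threads=16):
    """Return (argv, expected_outputs) for `arcasHLA extract` on one sample.

    Hypothetical helper: the paths mirror the layout used in this chapter.
    """
    bam = PurePosixPath(star_dir) / sample / f"{sample}.sorted.bam"
    sample_out = PurePosixPath(out_dir) / sample
    argv = [
        "arcasHLA", "extract", str(bam),
        "-t", str(threads), "-v",
        "--sample", sample,
        "-o", str(sample_out),
    ]
    # One extracted FASTQ per mate, following the file names listed above.
    expected = [str(sample_out / f"{sample}.extracted.{mate}.fq.gz")
                for mate in (1, 2)]
    return argv, expected


if __name__ == "__main__":
    argv, expected = build_extract_command("SRR8281218")
    # Print the command instead of running it; to execute, pass argv to
    # subprocess.run(argv, check=True) on a machine with arcasHLA installed.
    print(" ".join(argv))
    print("\n".join(expected))
```

Returning the argument list rather than a single string avoids shell quoting issues if the command is later passed to `subprocess.run`.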
Then we identify the HLA alleles using the extracted reads:
arcasHLA genotype analysis/neoantigen/SRR8281218/SRR8281218.extracted.1.fq.gz analysis/neoantigen/SRR8281218/SRR8281218.extracted.2.fq.gz -g A,B,C,DQA1,DQB1,DRB1 -t 16 -v -o analysis/neoantigen/SRR8281218
### Output from genotyping
analysis/neoantigen/SRR8281218/SRR8281218.genotype.json
cat analysis/neoantigen/SRR8281218/SRR8281218.genotype.json
###
{"A": ["A*26:01:01", "A*03:01:01"], "B": ["B*35:01:01", "B*07:02:01"], "C": ["C*07:02:01", "C*04:01:01"], "DQA1": ["DQA1*02:01:01", "DQA1*03:01:01"], "DQB1": ["DQB1*03:02:01"], "DRB1": ["DRB1*04:02:01", "DRB1*07:01:01"]}
###
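The genotype JSON maps each locus to a list of called alleles. As a rough illustration of how this output might be read downstream, the sketch below parses the JSON shown above, treats a locus with a single allele (DQB1 here) as a homozygous call, and trims each allele to two fields. The function names `two_field` and `summarize` are hypothetical, and the two-field trimming is an assumption about what a downstream binding predictor might expect, not something this chapter specifies.

```python
import json

# Genotype JSON for SRR8281218, copied from the output above.
GENOTYPE_JSON = """
{"A": ["A*26:01:01", "A*03:01:01"], "B": ["B*35:01:01", "B*07:02:01"],
 "C": ["C*07:02:01", "C*04:01:01"], "DQA1": ["DQA1*02:01:01", "DQA1*03:01:01"],
 "DQB1": ["DQB1*03:02:01"], "DRB1": ["DRB1*04:02:01", "DRB1*07:01:01"]}
"""


def two_field(allele):
    """Trim an allele such as 'A*26:01:01' to two fields: 'A*26:01'."""
    gene, fields = allele.split("*", 1)
    return gene + "*" + ":".join(fields.split(":")[:2])


def summarize(genotype):
    """Map each locus to ((allele1, allele2), is_homozygous)."""
    summary = {}
    for locus, alleles in genotype.items():
        trimmed = [two_field(a) for a in alleles]
        # Assumption: a single reported allele means a homozygous call.
        homozygous = len(trimmed) == 1
        if homozygous:
            trimmed = trimmed * 2
        summary[locus] = (tuple(trimmed), homozygous)
    return summary


if __name__ == "__main__":
    genotype = json.loads(GENOTYPE_JSON)
    for locus, (pair, homozygous) in summarize(genotype).items():
        print(locus, *pair, "(homozygous)" if homozygous else "")
```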
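Merge individual HLAs
RIMA also merges the individual arcasHLA results into a summary file with one row per sample and two columns per locus, one for each allele (for example, DQA11 and DQA12 hold the two DQA1 alleles). A homozygous call, such as DQB1 in SRR8281218, appears as the same allele in both columns: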
subject A1 A2 B1 B2 C1 C2 DQA11 DQA12 DQB11 DQB12 DRB11 DRB12
SRR8281238 A*01:01:01 A*02:01:01 B*35:01:01 B*08:01:01 C*07:01:01 C*04:01:01 DQA1*05:01:01 DQA1*01:01:01 DQB1*02:01:01 DQB1*05:01:01 DRB1*01:01:01 DRB1*03:01:01
SRR8281233 A*01:01:01 A*02:01:01 B*57:01:01 B*44:02:01 C*05:01:01 C*06:02:01 DQA1*01:02:01 DQA1*03:01:01 DQB1*06:02:01 DQB1*03:02:01 DRB1*04:01:01 DRB1*15:01:01
SRR8281236 A*33:01:01 A*24:02:01 B*14:02:01 B*15:01:01 C*06:02:01 C*08:02:01 DQA1*01:02:02 DQA1*03:01:01 DQB1*03:02:01 DQB1*05:02:01 DRB1*04:03:01 DRB1*16:02:01
SRR8281243 A*01:01:01 A*24:02:01 B*35:02:01 B*41:01:01 C*04:01:01 C*17:01:01 DQA1*01:02:01 DQA1*01:05:01 DQB1*05:01:01 DQB1*06:09:01 DRB1*10:01:01 DRB1*13:02:01
SRR8281251 A*24:02:01 A*01:01:01 B*35:02:01 B*41:01:01 C*04:01:01 C*17:01:01 DQA1*01:02:01 DQA1*01:05:01 DQB1*05:01:01 DQB1*06:09:01 DRB1*10:01:01 DRB1*13:02:01
SRR8281230 A*01:01:01 A*02:01:01 B*57:01:01 B*44:02:01 C*05:01:01 C*06:02:01 DQA1*03:01:01 DQA1*01:02:01 DQB1*03:02:01 DQB1*06:02:01 DRB1*04:01:01 DRB1*15:01:01
SRR8281250 A*01:01:01 A*02:01:01 B*35:01:01 B*08:01:01 C*07:01:01 C*04:01:01 DQA1*05:01:01 DQA1*01:01:01 DQB1*02:01:01 DQB1*05:01:01 DRB1*01:01:01 DRB1*03:01:01
SRR8281244 A*25:01:01 A*02:01:01 B*18:01:01 B*08:01:01 C*07:02:01 C*12:03:01 DQA1*01:02:01 DQA1*01:02:01 DQB1*06:02:01 DQB1*06:02:01 DRB1*15:01:01 DRB1*15:01:01
SRR8281218 A*26:01:01 A*03:01:01 B*35:01:01 B*07:02:01 C*07:02:01 C*04:01:01 DQA1*02:01:01 DQA1*03:01:01 DQB1*03:02:01 DQB1*03:02:01 DRB1*04:02:01 DRB1*07:01:01
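A merge of this kind can be sketched in a few lines. The code below is an illustration only and is not RIMA's actual merge script: the function name `merge_genotypes`, the tab delimiter, and the `NA` placeholder for a missing locus are all assumptions. It builds the same header as the table above and repeats a single allele in both columns.

```python
import csv
import io

LOCI = ["A", "B", "C", "DQA1", "DQB1", "DRB1"]


def merge_genotypes(genotypes, loci=LOCI):
    """Build (header, rows) from a {sample: genotype dict} mapping."""
    header = ["subject"] + [f"{locus}{i}" for locus in loci for i in (1, 2)]
    rows = []
    for sample, genotype in genotypes.items():
        row = [sample]
        for locus in loci:
            alleles = list(genotype.get(locus, []))
            if len(alleles) == 1:
                alleles *= 2  # single allele: repeat it in both columns
            # Assumption: pad loci with no call using an "NA" placeholder.
            alleles += ["NA"] * (2 - len(alleles))
            row.extend(alleles[:2])
        rows.append(row)
    return header, rows


if __name__ == "__main__":
    # In practice each genotype would come from json.load() on a
    # <sample>.genotype.json file; one sample is inlined here.
    genotypes = {
        "SRR8281218": {
            "A": ["A*26:01:01", "A*03:01:01"],
            "B": ["B*35:01:01", "B*07:02:01"],
            "C": ["C*07:02:01", "C*04:01:01"],
            "DQA1": ["DQA1*02:01:01", "DQA1*03:01:01"],
            "DQB1": ["DQB1*03:02:01"],
            "DRB1": ["DRB1*04:02:01", "DRB1*07:01:01"],
        }
    }
    header, rows = merge_genotypes(genotypes)
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    print(buffer.getvalue(), end="")
```

Keeping the locus order in a single list makes the header and the row values come from the same source, so the columns cannot drift out of alignment.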