Chapter 9 HLA Typing

Tumor mutations can produce altered proteins, which may act as antigens and elicit an immune response. Algorithms such as NetMHCpan, MHCflurry, and SMMalign (https://pubmed.ncbi.nlm.nih.gov/17608956/) are commonly used to predict neoantigen peptides. Because peptide binding depends on the specific HLA alleles a patient carries, knowing the HLA type is necessary to identify potential neoantigens for targeted immunotherapy. Current alignment-based HLA typing methods take DNA or RNA sequencing data as input and predict either HLA class I alleles only or both class I and class II alleles. For example, arcasHLA, HLAProfiler, and seq2HLA were developed to perform high-resolution HLA typing from RNA-seq data. OptiType and PHLAT are other HLA typing tools that work with RNA, whole-exome, and whole-genome sequencing data. HLA reference sequences can be obtained from the ImMunoGeneTics (IMGT) database.

9.1 Identify HLA type

HLA typing is part of the neoantigen prediction module of RIMA.

RIMA uses arcasHLA to predict HLA types for both MHC class I and class II genes from bulk RNA-seq data. The sorted alignment BAM files generated by STAR are used as input to arcasHLA. Here we use one sample from the Zhao trial as an example. We first extract the HLA reads from the alignment file:

### Extract fastq reads
arcasHLA extract analysis/STAR/SRR8281218/SRR8281218.sorted.bam -t 16 -v --sample SRR8281218 -o analysis/neoantigen/SRR8281218

### Output from extraction
analysis/neoantigen/SRR8281218/SRR8281218.extracted.1.fq.gz
analysis/neoantigen/SRR8281218/SRR8281218.extracted.2.fq.gz

Then we identify the HLA alleles using the extracted reads:

arcasHLA genotype analysis/neoantigen/SRR8281218/SRR8281218.extracted.1.fq.gz analysis/neoantigen/SRR8281218/SRR8281218.extracted.2.fq.gz -g A,B,C,DQA1,DQB1,DRB1 -t 16 -v -o analysis/neoantigen/SRR8281218

### Output from genotyping
analysis/neoantigen/SRR8281218/SRR8281218.genotype.json
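
The two steps above can be scripted for any sample. The following is a minimal Python sketch, not part of RIMA itself: the helper name `arcashla_commands` is hypothetical, and it assumes the `analysis/STAR/<sample>/` and `analysis/neoantigen/<sample>/` layout shown in this chapter. It uses only the arcasHLA flags that appear in the commands above and builds the command lines without running them.

```python
import shlex
from pathlib import Path

# Loci requested with -g in the genotype command shown above.
GENES = ("A", "B", "C", "DQA1", "DQB1", "DRB1")


def arcashla_commands(sample, bam, outdir, genes=GENES, threads=16):
    """Build the arcasHLA extract and genotype commands for one sample.

    Hypothetical helper: it mirrors the two commands in the text and
    returns them as argument lists; nothing is executed here.
    """
    outdir = Path(outdir)
    extract = [
        "arcasHLA", "extract", str(bam),
        "-t", str(threads), "-v",
        "--sample", sample,
        "-o", str(outdir),
    ]
    # File names follow the extraction output listed above.
    fq1 = outdir / f"{sample}.extracted.1.fq.gz"
    fq2 = outdir / f"{sample}.extracted.2.fq.gz"
    genotype = [
        "arcasHLA", "genotype", str(fq1), str(fq2),
        "-g", ",".join(genes),
        "-t", str(threads), "-v",
        "-o", str(outdir),
    ]
    return extract, genotype


sample = "SRR8281218"
extract_cmd, genotype_cmd = arcashla_commands(
    sample,
    bam=f"analysis/STAR/{sample}/{sample}.sorted.bam",
    outdir=f"analysis/neoantigen/{sample}",
)
print(shlex.join(extract_cmd))
print(shlex.join(genotype_cmd))
```

To actually run the steps, each list can be passed to `subprocess.run(cmd, check=True)`, extract first and genotype second, provided arcasHLA is installed and on the `PATH`.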
 
 
cat analysis/neoantigen/SRR8281218/SRR8281218.genotype.json

###
{"A": ["A*26:01:01", "A*03:01:01"], "B": ["B*35:01:01", "B*07:02:01"], "C": ["C*07:02:01", "C*04:01:01"], "DQA1": ["DQA1*02:01:01", "DQA1*03:01:01"], "DQB1": ["DQB1*03:02:01"], "DRB1": ["DRB1*04:02:01", "DRB1*07:01:01"]}
###
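
The genotype file is plain JSON that maps each locus to a list of called alleles. As an illustration only, the sketch below loads the result shown above, groups the loci into class I (A, B, C) and class II (DQA1, DQB1, DRB1), and flags loci with a single reported allele, such as DQB1 in this sample. The function name `summarize_genotype` is an assumption for this example, and reading a single allele as a homozygous call is an interpretation rather than something the file states.

```python
import json

CLASS_I = {"A", "B", "C"}
CLASS_II = {"DQA1", "DQB1", "DRB1"}


def summarize_genotype(genotype):
    """Group called alleles by HLA class and list single-allele loci.

    `genotype` is the dict loaded from a *.genotype.json file
    (locus -> list of allele names). Hypothetical helper.
    """
    summary = {"class_I": {}, "class_II": {}, "other": {},
               "single_allele_loci": []}
    for locus, alleles in genotype.items():
        if locus in CLASS_I:
            group = "class_I"
        elif locus in CLASS_II:
            group = "class_II"
        else:
            group = "other"  # any locus outside the six requested here
        summary[group][locus] = list(alleles)
        if len(alleles) == 1:
            summary["single_allele_loci"].append(locus)
    return summary


# Genotype of SRR8281218, copied from the output above.
example = json.loads("""
{"A": ["A*26:01:01", "A*03:01:01"], "B": ["B*35:01:01", "B*07:02:01"],
 "C": ["C*07:02:01", "C*04:01:01"], "DQA1": ["DQA1*02:01:01", "DQA1*03:01:01"],
 "DQB1": ["DQB1*03:02:01"], "DRB1": ["DRB1*04:02:01", "DRB1*07:01:01"]}
""")

summary = summarize_genotype(example)
print(summary["class_I"])
print(summary["class_II"])
print(summary["single_allele_loci"])
```

For a file on disk, replace `json.loads(...)` with `json.load(open(path))` using the `genotype.json` path from the genotyping step.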

Merge individual HLAs

RIMA also merges the individual HLA results from arcasHLA into a summary file:

subject A1  A2  B1  B2  C1  C2  DQA11   DQA12   DQB11   DQB12   DRB11   DRB12
SRR8281238  A*01:01:01  A*02:01:01  B*35:01:01  B*08:01:01  C*07:01:01  C*04:01:01  DQA1*05:01:01   DQA1*01:01:01   DQB1*02:01:01   DQB1*05:01:01   DRB1*01:01:01   DRB1*03:01:01
SRR8281233  A*01:01:01  A*02:01:01  B*57:01:01  B*44:02:01  C*05:01:01  C*06:02:01  DQA1*01:02:01   DQA1*03:01:01   DQB1*06:02:01   DQB1*03:02:01   DRB1*04:01:01   DRB1*15:01:01
SRR8281236  A*33:01:01  A*24:02:01  B*14:02:01  B*15:01:01  C*06:02:01  C*08:02:01  DQA1*01:02:02   DQA1*03:01:01   DQB1*03:02:01   DQB1*05:02:01   DRB1*04:03:01   DRB1*16:02:01
SRR8281243  A*01:01:01  A*24:02:01  B*35:02:01  B*41:01:01  C*04:01:01  C*17:01:01  DQA1*01:02:01   DQA1*01:05:01   DQB1*05:01:01   DQB1*06:09:01   DRB1*10:01:01   DRB1*13:02:01
SRR8281251  A*24:02:01  A*01:01:01  B*35:02:01  B*41:01:01  C*04:01:01  C*17:01:01  DQA1*01:02:01   DQA1*01:05:01   DQB1*05:01:01   DQB1*06:09:01   DRB1*10:01:01   DRB1*13:02:01
SRR8281230  A*01:01:01  A*02:01:01  B*57:01:01  B*44:02:01  C*05:01:01  C*06:02:01  DQA1*03:01:01   DQA1*01:02:01   DQB1*03:02:01   DQB1*06:02:01   DRB1*04:01:01   DRB1*15:01:01
SRR8281250  A*01:01:01  A*02:01:01  B*35:01:01  B*08:01:01  C*07:01:01  C*04:01:01  DQA1*05:01:01   DQA1*01:01:01   DQB1*02:01:01   DQB1*05:01:01   DRB1*01:01:01   DRB1*03:01:01
SRR8281244  A*25:01:01  A*02:01:01  B*18:01:01  B*08:01:01  C*07:02:01  C*12:03:01  DQA1*01:02:01   DQA1*01:02:01   DQB1*06:02:01   DQB1*06:02:01   DRB1*15:01:01   DRB1*15:01:01
SRR8281218  A*26:01:01  A*03:01:01  B*35:01:01  B*07:02:01  C*07:02:01  C*04:01:01  DQA1*02:01:01   DQA1*03:01:01   DQB1*03:02:01   DQB1*03:02:01   DRB1*04:02:01   DRB1*07:01:01
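
A table like the one above can be assembled from the per-sample `genotype.json` files. The sketch below is an assumption about how such a merge could be done, not RIMA's actual implementation; the names `genotype_row` and `merge_genotypes` are hypothetical. It writes one tab-separated row per sample with two columns per locus, and repeats the allele when only one is reported, which matches how DQB1 appears for SRR8281218 in the table.

```python
import csv
import json
import sys
from pathlib import Path

LOCI = ("A", "B", "C", "DQA1", "DQB1", "DRB1")


def genotype_row(subject, genotype, loci=LOCI):
    """Flatten one genotype dict into summary columns such as A1, A2, DQA11."""
    row = {"subject": subject}
    for locus in loci:
        alleles = list(genotype.get(locus, []))
        if len(alleles) == 1:
            alleles = alleles * 2           # repeat a single reported allele
        alleles += [""] * (2 - len(alleles))  # leave missing loci blank
        row[f"{locus}1"], row[f"{locus}2"] = alleles[:2]
    return row


def merge_genotypes(json_paths, out_handle, loci=LOCI):
    """Write a tab-separated summary for several *.genotype.json files."""
    header = ["subject"] + [f"{locus}{i}" for locus in loci for i in (1, 2)]
    writer = csv.DictWriter(out_handle, fieldnames=header,
                            delimiter="\t", lineterminator="\n")
    writer.writeheader()
    for path in json_paths:
        path = Path(path)
        # Sample ID is the file name up to the first dot,
        # e.g. SRR8281218.genotype.json -> SRR8281218.
        subject = path.name.split(".")[0]
        with path.open() as handle:
            writer.writerow(genotype_row(subject, json.load(handle), loci))


if __name__ == "__main__":
    # Usage: python merge_hla.py analysis/neoantigen/*/*.genotype.json > summary.txt
    merge_genotypes(sys.argv[1:], sys.stdout)
```

Keeping the row construction separate from the file handling makes the padding rule easy to check on a single genotype before merging a whole cohort.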

9.2 Video demo