Removing samples with missing data at any of the phenotypes
phenotypes_apply_rint
false
Apply Rank Inverse Normal Transformation (RINT) to quantitative phenotypes in both steps
target_build
empty
Specify the desired target_build (hg19 or hg38) of your data. In case it’S different from the input data, lift-over is executed
Chunking
nf-gwas allows to split your prediction and association data in smaller chunks to utilize large-scale clusters.
Option
Default
Description
genotypes_association_chunk_size
0
Desired chunksize in bases. if 0, split by chromosomes in regenie_step2
genotypes_association_chunk_strategy
‘RANGE’
Chunking strategy by range or by variants (RANGE or VARIANTS)
genotypes_prediction_chunks
0
Chunking for regenie_step 1. If 0, no chunking of predictions
Annotation and GWAS Report
For annotation, a rsid file is required. If not set, this will be downloaded each time. You can also prepare the file once and set it in your pipeline. Substitute rsbuild with hg19 or hg38.
Annotation limit for interactive and static report
annotation_peak_pval
1.5
annotation_max_genes
20
Number of max annotated genes
Static Report
Option
Default
Description
plot_ylimit
0
Limit y axis in Manhattan/QQ plot for large p-values
manhattan_annotation_enabled
true
Use annotation for Manhattan plot
Pruning Step
Option
Default
Description
prune_enabled
false
Enable pruning step
prune_maf
0.01
MAF filter
prune_window_kbsize
1000
Window size
prune_step_size
100
Step size (variant ct)
prune_r2_threshold
0.9
Unphased hardcall R2 threshold
Quality Control (QC) of Predictions
Option
Default
Description
qc_maf
0.01
Minor allele frequency (MAF) filter
qc_mac
100
Minor allele count (MAC) filter
qc_geno
0.1
Genotype missingess
qc_hwe
1e-15
Hardy-Weinberg equilibrium (HWE) filter
qc_mind
0.1
Sample missigness
Convert VCF to PLINK format
Option
Default
Description
vcf_conversion_split_id
false
If false, family and individual IDs are set to the sample ID (using plink2 --double-id option). If true, split VCF by “_” into FID and IID (--id-delim)
Prediction Step (Regenie Step 1)
The following parameters are all regenie specific. Please click here to learn more about them.
Option
Default
Description
regenie_skip_predictions
false
Skip regenie Step 1 predictions
regenie_force_step1
false
Run regenie step 1 when >1M genotyped variants are used (not recommended)
regenie_bsize_step1
1000
Size of the genotype blocks
regenie_step1_optional
null
Add optional regenie step 1 params not directly supported
Single-variant and Gene-based Tests (Regenie Step 2)
The following parameters are all regenie specific. Please click here to learn more about them.
Option
Default
Description
regenie_bsize_step2
400
Size of the genotype blocks
regenie_firth
true
Use Firth likelihood ratio test (LRT) as fallback for p-values less than threshold
regenie_firth_approx
true
Use approximate Firth LRT for computational speedup
regenie_step2_optional
null
Add optional regenie step 2 params not directly supported
Single-variant Tests Only
Option
Default
Description
regenie_sample_file
empty
Sample file corresponding to input BGEN file
regenie_min_imputation_score
0.00
Minimum imputation info score (IMPUTE/MACH R^2)
regenie_min_mac
5
Minimum minor allele count
regenie_ref_first
false
Specify to use the first allele as the reference allele for BGEN or PLINK bed/bim/fam file input [default is to use the last allele as the reference]
Gene-based Tests Only
The following gene-based parameters are all regenie specific. Please click here to learn more about this feature.
Option
Default
Description
regenie_gene_aaf
1 %
comma-separated list of AAF upper bounds to use when building masks
regenie_gene_test
-
comma-separated list of SKAT/ACAT-type tests to run
regenie_gene_joint
-
comma-separated list of joint tests to apply on the generated burden masks
regenie_gene_build_mask
max
build masks using the maximum number of ALT alleles across sites, or the sum of ALT alleles (‘sum’), or thresholding the sum to 2 (‘comphet’)
regenie_write_bed_masks
-
write mask to PLINK bed format (does not work when building masks with ‘sum’)
regenie_gene_vc_mac_thr
10
MAC threshold below which to collapse variants in SKAT/ACAT-type tests
regenie_gene_vc_max_aaf
100%
AAF upper bound to use for SKAT/ACAT-type tests
Interaction Tests Only
The following interaction test parameters are all regenie specific. Please click here to learn more about this feature.
Option
Default
Description
regenie_rare_mac
1000
minor allele count (MAC) threshold below which to use HLM method for QTs
regenie_no_condtl
false
to print out all the main effects from the interaction model
regenie_force_condtl
false
to include the interacting SNP as a covariate in the marginal test