Running FDR Correction on Probability Maps Using R


Written June 11, 2007 by Liberty Hamilton. Updated 02/17/2009 by Owen Phillips. Email Dr. Katherine Narr if you have any questions.

 

False Discovery Rate (FDR) Correction is a statistical method used in multiple hypothesis testing to correct for type I errors (false positives) in multiple comparisons. To run FDR with the batch commands specified below, you must have already generated probability maps in R. You can then provide either a p-value cutoff or a q-value to determine the false discovery rate for any given analysis. The output of these scripts will be a text file listing the FDR as a percentage value. If you wish to perform an FDR and output .ucf files as well as corrected p-values, you will have to use the anova_shape_fdr_hemispheres_batch.R batch script.

 

Running on the Command Line

To run on the command line, you must first make sure that your account is set up to use R on cranium ("the grid"). To do this, follow these steps:

  1. In your home directory, make a backup of your .Renviron file that configures to run on inire (you might not have one).
  2. Copy the .Renviron file from ~woods:
    cp ~woods/.Renviron ./
  3. Connect to cerebro-sn1. ssh -X user@qsub.loni.ucla.edu
  4. source /usr/sge/loni/common/settings.csh (if you are using the .csh shell. This may not be necessary, but won't do any harm if you do run it.)

Side note: Notice that .Renviron is a hidden file, so you will not see it in your directory by running a simple ls. To list all of the files in a directory, including hidden files that start with a dot (.), type ls -a.

There are two FDR batch scripts that you can use: (1) compute_fdr_for_p_cutoff_batch.R and (2) compute_fdr_threshold_batch.R. These are both located in /ifs/woods/rshape/batch_commands/.

 

Compute FDR for p cut-off

This batch script allows you to specify a p cut-off value and get false discovery rates for areas on your UCF that are significant to p less than or equal to a given cut-off. To run on the command line, you will need an input list file with the files for analysis. The input will look like this:

File
/fullpathtoyourdirectory/10101_genofx_L.ucf
/fullpathtoyourdirectory/10101_genofx_R.ucf

You must either do both hemispheres together, or if you want to analyze hemispheres separately, you will need to make text files with the hemisphere duplicated, like so:

File
/fullpathtoyourdirectory/10101_genofx_L.ucf
/fullpathtoyourdirectory/10101_genofx_L.ucf

Note that there MUST BE a carriage return after the last ucf file.

 

Sample command line

/ifs/woods/R/R-2.7.2/bin/R CMD BATCH --no-save --no-restore --quiet --args -listInputlist.txt -plevel0.05 /ifs/woods/rshape/batch_commands/compute_fdr_for_p_cutoff_batch.R /mydirectory/FDR_outputp0-05.txt

 

Compute FDR for q cut-off

In statistics, the Q value is the minimum false discovery rate at which the test may be called significant. To compute false discovery rates for a q-value cut-off, use the compute_fdr_threshold_batch.R batch script in /ifs/woods/rshape/batch_commands/. To run on the command line, you will need a few things ready first. As before, you will need your input .txt file. Here is a sample command line (remember the qsub stuff goes first so that the job will be submitted to the grid):

 

Sample command line

qsub -b y -q long.q /ifs/woods/R/R-2.7.2/bin/R CMD BATCH --no-save --no-restore --quiet --args -list/yourdirectory/list.txt -qlevel0.05 /ifs/woods/rshape/batch_commands/compute_fdr_theshold_batch.R /yourdirectory/genotypeeffect_q0-05.txt

 

Running in the Pipeline--FDR not working on the Pipeline currently. Protocol will be updated soon. 02/18/2009

You can also do the same things described above by running FDR in the pipeline (also see the Pipeline User Guide for more help with pipeline related issues). This is very convenient, as you can provide lists of the text files and output files to run FDR correction in parallel for many jobs.

  1. First, you will need to download the latest version of the LONI Pipeline Processing Environment
  2. Then, you will need to download the FDR_correct.pipe for p cut-off pipeline workflow.
  3. Open up the pipeline file and look at the inputs and outputs. You will not need to change the values for the fields circled in the figure below. These inputs essentially set up R for execution, and tell the program that it will be running the batch command that is specified in the last input.
    Don't change the values in these fields.
  4. Notice that there is an empty input parameter circle and an empty output parameter circle. Enter your input file (a .txt file, created as specified above, or a .list file, which will be a list of your input .txt files with the full path to each. Use a .list file if you wish to do many cases in parallel). A .list file would look like this:

    /mydirectory/fdr_genotype_list.txt
    /mydirectory/fdr_sexfx_list.txt
    /mydirectory/fdr_genotype_rightonly_list.txt

    The fdr_*_list.txt files would look like the text files mentioned above. If you are using a .list file as your input, you must also have a .list output, which will be a list of output file names. Be sure to specify the "remote" option, with the LONI pipeline server (pipelinev4.loni.ucla.edu) as your remote server.

    Change list input.
  5. Enter your outputs in the same way as your input, and be sure to specify the "remote" option as before.
  6. There is also a p-value input parameter circle. Double-click on the p-value input circle and enter your p-value.
    Change p-value input.
  7. If you wish to run the FDR Threshold hemispheres batch script (thresholding with q values instead of p values, download the FDR_correct_qlevel.pipe pipeline module (right-click or control-click and select "Save Link As"). Perform the same steps as above, but enter the q-value in the q-value input and your text files as before.

Running anova_shape_fdr_hemispheres_batch.R

As mentioned earlier, you may not want to display uncorrected data in your results figures if your effect is very robust. In this case, you could run an ANOVA with FDR correction to obtain both the FDR and the .ucf file.

 

Example Command Line:

/ifs/woods/R/R-2.7.2/bin/R CMD BATCH --no-save --no-restore --quiet --args -table/path/LeftTextFile.txt -table2/path/RightTextFile.txt -formulay~sex+age+group -reducedy~sex+age -output/path/LFDRoutput.ucf -output2/path/RFDRoutput.ucf /ifs/woods/rshape/batch_commands/anova_shape_fdr_hemispheres_batch.R /path/LR_FDR_output.txt

 

References

  1. Genovese CR, Lazar NA, Nichols T. Thresholding of statistical maps in functional neuroimaging using the false discovery rate. NeuroImage 2002; 15: 870-878.
  2. Storey JD. A direct approach to false discovery rates. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 2002; 64: 479-498.
  3. Storey JD, Taylor JE, Siegmund D. Strong control, conservative point estimation and simultaneous conservative consistency of false discovery rates: a unified approach. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 2004; 66: 187-205.