SOLAR-Eclipse Genetics
Return to command index

A1.1 about


Purpose:  Copyright, authors, and disclaimers

SOLAR is Copyright (c) 1995-2014 Southwest Foundation for Biomedical
Research.  All rights reserved.

The authors are John Blangero, Kenneth Lange, Laura Almasy, Harald Goring,
Jeff Williams, Tom Dyer, Michael Boehnke, and Charles Peterson.  Parts of
SOLAR consist of software developed by others; a complete list is provided
in Appendix Four of the documentation included with this package (use the
"doc" command to find out where that is).

Use of this program should be acknowledged in scientific publications.

Commands, features, performance, and availability are subject to change.
There is absolutely no warranty, express or implied.
There is no committment to support scripts written using current commands
  in future releases.

Shortcuts: abou - about

Return to command index

A1.2 allcovar


Purpose:  Set up all non-trait variables as covariates

Usage:    allcovar

Notes:    Phenotypes and trait commands must already have been given.
          If there is a variable named "age," it will be set up
          as "age^1,2#sex."  If undesired variables are made into
          covariates, they should be removed with the covariate
          delete command.

          allcovar will not include names mapped to any of the standard
          field variables (see 'help field').  Be sure to set up field
          mappings (if required) first to ensure you don't get extra
          covariates for the likes of ID, FAMID, etc.

          allcovar will also not include any names added to the 'exclude'
          list.  Use the 'exclude' command to add names to the exclude
          list, or to display the exclude list.  By default, the exclude
          list includes some standard PEDSYS mnemonics

Shortcuts: allc - allcovars

Return to command index

A1.3 allsnp


Purpose:  Include all snps as covariates in current model

Usage:    allsnp

Notes:    allsnp includes all the phenotypes prefixed with snp_ or
          hap_ as covariates in the current model.  This is often the
          first step in a qtn analysis.  Afterwards, you can remove
          some snps using the "covariate delete" command.

          It is OK if you have already selected other covariates,
          including some of the snps.  Every covariate is only added
          once no matter what.

          allsnp looks at all currently loaded phenotype files.

Return to command index

A1.4 alnorm


Purpose:  Evaluates the tail of normal curve

Usage:    alnorm <x> [t | f]

          If "t", curve is evaluated from X to infinity
          If "f", curve is evaluated from minus infinity to X


Shortcuts: alnorm - alnorm

Return to command index

A1.5 annotate_gwas


Purpose:  Add annotated gene data to a gwas.out file generated by gwas command or gpu_gwas
    command.  The command only outputs snps with pvalues less than equal to threshold.
    Default threshold is 0.05
         
Usage: annotate_gwas  -i <gwas data filename> -a <annotate data filename> <annotate field list filename> 
		 -o <output filename>>  Optional:  -t <pvalue threshold>
		-i <gwas data filename> Contains the output from commands gwas or gpu_gwas
		-a <annotate data filename> <annotate field list filename>  The first 
		argument is the filename containing the snp annotation data.  The second
		argument is the file name containing the list of field names separated by
		commas.
		-o <output filename> Annotated gwas data is written out to the file name
		specified with this argument
		-t <pvalue threshold>  This is the threshold that determines whether a snp within the input
		file will be written to output.  Any snp with a pvalue less than or equal
		to the threshold will be included.



Return to command index

A1.6 ascertainment


Purpose:  Describe ascertainment correction using proband(s)

Ascertainment correction is available through use of the proband field in
the phenotypes file.  Ascertainment correction by conditioning on
probands is automatically performed if there is a field named 'proband',
'probnd', or 'prband' (in upper or lower case) in the phenotypes file.  
Probands are those individuals through whom the pedigree has been 
ascertained.

In a proband field, blank ( ) or zero (0) signifies non-proband (normal)
status, and anything else signifies proband status.  No decimal point is
permitted after the zero.

If your proband field is named something else, the safest approach is to
modify your data files accordingly.  If that is not possible, you can use
the SOLAR 'field' command to map your name.  For example, if your proband
field is actually named 'Affected', you would use the following command:

    solar> field probnd Affected

(Note that the name 'probnd' is used as a field type selector because that
 is the PEDSYS standard mnemonic.)

For routine use, such a field command should be included in a .solar
startup file or user script.  Field mappings are not included in model
files.  For that reason, it may be safest to modify the code or data file
if possible.

Conversely, if your file has a probnd field but you wish it to be ignored,
you can rename the probnd field or give a command like the following:

    solar> field probnd -none

Proband individuals are required to have all the quantitative
variables required of other individuals to be included (as probands) in
the analysis.  Probands who are missing any quantitative variables are
not included in the Proband Count and except for defining the pedigree
structure do not enter into any calculations.

Unlike the program Fisher, SOLAR does not require probands to be at the
beginning of pedigrees, and does not require you to provide a "proband
count."  Other than that, SOLAR uses the ascertainment correction
algorithm built-in to Fisher.

SOLAR prints a Proband Count in the summary statistics, which are written
to maximization output files and to the terminal unless the verbosity
level is set low.

Shortcuts: ascer - ascertainment

Return to command index

A1.7 automodel


Purpose:  Default model setup

Usage:    automodel <phenotypes> <trait>
             phenotypes is the name of phenotype file
             trait is the name of trait
                 (all other variables will be used as covariates)

Notes:   1.  Automodel will create a new model, with all non-trait and
             non-pedigree variables as covariates (see note 2).

         2.  The pedigree-related fields listed by the 'field' command
             will not be used as covariates (except for SEX, which will be).
             Certain other standard PEDSYS names are in the default exclude
             list.  You can add additional items to the exclude list with
             the exclude command.  See 'exclude' and 'allcovar' help.

         3.  Boundaries and starting points are set automatically by the
             maximize command.

         4.  You can pick and choose from the commands that automodel uses
             if you want to do things differently. Here is the body of 
             automodel:

                 model new                   ;# Start a new model
                 phenotypes load filename    ;# load phenotypes file
                 trait traitname             ;# assign trait variable
                 allcovar                    ;# assign covariates
                 polymod                     ;# set polygenic model type

Shortcuts: autom - automodel

Return to command index

A1.8 bayesavg


Purpose:  Perform bayesian oligogenic model averaging
          on covariates or linkage components of the current model.

Usage:    bayesavg [-cov[ariates]] [-ov[erwrite]] [-redo]
                   [-max <max>] [-cutoff <cutoff>] [-symmetric]
                   [-list <listfile>] [-fix [cov|param]]
                   [-size_log_n] [-nose] [-old_log_n]
                   [-sporadic] [-h2rf h2r_factor] [-saveall]
                   [-qtn] [-stop] [-nostop]

          bayesavg -r[estart]   ;# (see also -redo)

  SPECIAL NOTE:  THE ALGORITHMS CHANGED in VERSION 1.7.3.  SEE NOTES 1-4.
                 NUMERIC RESULTS MAY DIFFER FROM PREVIOUS VERSIONS.

              -covariates (or -cov)  Perform bayesian model averaging
                 on the covariates only.  (The default is to perform
                 bayesian model averaging on the linkage elements.)

              -overwrite (or -ov) means force overwrite of existing output
                 files

              -max  Only include this number of components, or fewer,
                    at one time.  This reduces the number of models
                    enormously, particularly for large N.

              -list file contains a list of the elements to use.
                    There is one line for each covariate or linkage
                    parameter.  Remaining covariates or linkage parameters
                    in the starting model are automatically fixed.
                    Covariates need not be present in the starting model,
                    but linkage parameters (and their matrices, etc.) must
                    be included in the starting model.

              -fix (or -f) fix (lock in) this covariate.  A fixed element
                 covariate (specified by covariate name, e.g. "age") or 
                 linkage element (by linkage parameter name, e.g. "h2q1")
                 is carried through all models.  (Note: a -fix or -f 
                 qualifier is required for each covariate to be fixed,
                 for example:  -f age -f sex.)  When fixed elements are
                 included, it is adviseable to run "polygenic" on the
                 starting model first.
                  
              -cutoff (optional) sets the BIC limit for occam's window 
                 (default: 6) 

              -log_n specify the log(n) to use.  Normally this is first
                 estimated from the samplesize of the unsaturated model,
                 then recalculated from the standard deviation of the
                 mean and it's standard error in the model with the best BIC.

              -symmetric (or -sym) use "symmetric" Occam's window.
                  The default is a "strict" Occam's window, which excludes
                  superset models with higher BIC; symmetric Occam's window
                  includes ALL models within BIC cutoff.

              -stop   Stop when no models in the last group with the same
                      size (degrees of freedom) have entered the window.
                      (This is the default for -qtn.)

              -nostop  Do not stop when no models in the last group with
                       the same size have entered the window.  (Useful
                       for overriding the default for -qtn.)  If -stop
                       or -qtn is specified, however, the report if any
                       models have entered the window is still given.

              -restart (or -r) means restart previous bayesavg run that was
                 terminated before completion.  This begins with the
                 model after the last one in the output file.  Do not use
                 -restart if last run completed.  When restarting, set
                 the trait or outdir, then give the command "bayesavg
                 -restart" with no other arguments.  The original model
                 and other arguments are automatically recalled.
                 Previous command arguments are read from
                 bayesavg.command and the previous starting model is
                 c.orig or cov.orig.  If you need to change anything, use
                 the -redo option instead.  You will also need to use
                 the -redo option if the first pass through all models
                 completed, or if the bayesavg was started under
                 a previous version of SOLAR.

               -redo is a special form of restart that allows you to change
                 some options.  Unlike -restart, -redo REQUIRES YOU TO
                 SPECIFY ALL OPTIONS AND LOAD ORIGINAL STARTING MODEL.
                 Only models not already found in the output file will be
                 maximized.  

                 There are several cases where you must use -redo instead
                 -restart:  (1) If you need to
                 re-maximize models which had convergence problems
                 previously (edit them out of bayesavg*.est file, change
                 boundaries, then -redo).  (2) If previous bayesavg run
                 completed but you want to try a different window cutoff or
                 type.  (3) You deleted all files except the bayesavg.est
                 file.  (4) You need to restart from a previous version of
                 SOLAR.  Unlike -restart, with -redo you must set up the
                 starting model and commands either as they were previously
                 or with desired changes.  Since you must set up the
                 original model EXACTLY, and specify options either EXACTLY
                 as they were originall specified, or with the desired
                 changes, you are advised to use this option carefully.
                 It is a good idea to make a backup copy of the outdir
                 first.

              -saveall will force the saving of all models.  Normally only
                 the models within Occam's window are saved.  (Note:
                 models outside the window will not have standard errors.)

              -size_log_n Use the log(n) estimated from sample size as the
                          final log(n).  This bypasses the computation of
                          log(n) from the S.E. of the SD parameter of the
                          model with the best BIC.

              -nose       Do not compute standard errors for any models
                          (normally they are only computed for models
                          in the window).  Unless you specify a particular
                          -log_n, the log(n) estimated from sample size
                          will be used (as with -size_log_n).

              -old_log_n  This calculates log(n) the old fashioned way,
                          using the saturated model for covariate analysis
                          or the unsaturated model for linkage analysis.
                          This option is provided for comparison with
                          earlier releases, and may soon be removed.

              -h2rf (optional) is used to set upper bound of H2r
                 (default: 1.1)  See notes below.  Use of this option
                 is now unnecessary because of automated boundary control.

              -sporadic  This option is depricated.  Force all models
                 to sporadic.  Valid only with -covariate.  Now you can
                 accomplish the same thing by making the starting model
                 sporadic.

              -qtn   Quantitative Trait Nucleotide Analysis:
                     A "covariate" analysis is done with "-stop" in effect.
                     Covariates with name snp_* or hap_* are automatically
                     included but other covariates are excluded.  A special
                     "windowfile" named bayesavg_cov.win is also
                     produced.  The -stop default can be overridden with
                     -nostop.  To include all snps in the starting model,
                     use the separate command "allsnp".

Output:   In addition to the terminal display, the following files are
          created (<outname> is "bayesavg" for linkage analysis or 
          "bayesavg_cov" for covariate analysis):

          <outname>.avg         Final averaged results
          <outname>.out         Final BIC and other info for each model
                                  (standard errors for models in window)
          <outname>.history     History of analysis messages
          <outname>.est         Estimated BIC for each model (pass 1)
          <outname>.nose        Final BIC but no standard errors (pass 2)

          Models are saved with "c" <prefix> for linkage analysis and "cov"
          prefix for covariate analysis:

          <prefix>0.mod         Unsaturated model, with standard errors
          <prefix>1.mod         Model with element 1 (if saved)
          <prefix>12.mod        Model with elements 1 and 2 (if saved)
          <prefix>12_11.mod     Model with elements 1, 2, and 11.
          <prefix>.orig.mod     Original user model when started
          <prefix>.start.mod    Base model (unsaturated) before maximization
          <prefix>.base.mod     Maximized base model

Notes:    1)  bayesavg determines the number of variable (non-fixed)
              elements and sets N automatically.  N and the number of
              models are reported near the beginning.  A new algorithm
              is used to determine all the element combinations; this
              results in a more logical ordering in which the smallest
              models are evaluated first.

          2)  The first pass through all models is done with an approximate
              log(n) computed from the sample size.  The resulting file
              is bayesavg.est (or bayesavg_cov.est).  The final log(n) is
              then computed from the model with the best BIC, and all
              BIC's are recalculated with the resulting file being
              bayesavg.nose (or bayesavg_cov.nose).  Then, standard
              errors for only the models within Occam's window are
              recalculated.  The resulting final output file is
              bayesavg.out (or bayesavg_cov.out).  The output summary
              averages are reported in bayesavg.avg (or
              bayesavg_cov.avg).  This is a new algorithm designed to
              save time (by only calculating standard errors from the
              models in the window), be more robust, and give more
              accurate results.  Results may differ somewhat from those
              in earlier versions (prior to 1.7.3) of SOLAR.  Additional
              history of the analysis (the messages starting with "***")
              are saved in bayesavg.history (or bayesavg_cov.history).


          3)  To permit special models (with household effects, epistasis,
              etc.) to be handled, bayesavg no longer forces the starting
              model to be sporadic first.  It merely maximizes the current
              model, with all non-fixed elements removed, but with no
              change(s) to the starting omega or constraints.
              If the starting model cannot be maximized, the user is
              advised to run "polygenic" first.  Running "polygenic"
              first is probably a good idea in all -covariate cases,
              particularly if there are non-fixed elements.

          4)  Models are now "built-up" from the unsaturated model
              rather than being "constrained down" from the saturated
              model.  The unsaturated model itself is usually created
              by "constraining down" the starting model.

          5)  bayesavg may not support bivariate models.

Shortcuts: bayesa - bayesavg

Return to command index

A1.9 benice


Purpose:  Lower priority of SOLAR to allow more CPU for other jobs
          or lower priority of one SOLAR run relative to another

Usage:    benice              ; Set "nice" level to 15
          benice <LEVEL>      ; LEVEL is between 1 and 20
                              ; 20 is "most nice"

Notes:    This is intended for use on Unix systems which support the
          "renice" command, including Solaris 2.5 and above

          Once you have set a nice level, you cannot go back to a 
          higher priority on this process.  You must exit and restart.

          The default unix scheduling allows some time even for
          very "nice" jobs.  However, they get somewhat less CPU than
          other jobs.

          On the SFBR Ranch, scheduling is absolute, so that "nice"
          jobs will be suspended until all other jobs are done (or
          waiting for a system resource such as disk access).  Nice
          jobs have minimal (<1%) impact on other jobs, unless they
          hog huge gobs of memory.

Shortcuts: beni - benice

Return to command index

A1.10 blank


Purpose:  Blank individuals according to variable data conditions

Usage:    blank [-o] [-q] [-n] [<conditional expression>]

<conditional expression> can be any solar variable expression (as allowed
by the define command for covariates) that adds up to zero or non-zero.
If it adds to non-zero for a person, that person is removed from the sample.

          [-q]  Go about blanking quietly.

          [-o]  Force overwrite of existing definition having same name
                (see below for example of definition naming).

          [-n]  Make new definition name if this would otherwise
                conflict with existing definition name

          With no arguments, blank shows blanking definitions currently in
          effect.  To see all the definitions available, use define command.

Examples:

          blank class!=1           ;# include only class=1 in sample

          blank age<<55 + sex==1   ;# blank all but old guys

          blank age>=55 * sex==2   ;# blank only old guys

Notes:

1.  blank creates a definition and a null covariate to achieve the
    desired blanking.  It shows you what it does, and then suggests
    how this blanking may be deleted:

    solar> blank age<<55 + sex==1

    define blank_age = blank * (0!= (age<<55 + sex==1)
    covariate blank_age()
    To delete: covariate delete blank_age()

    solar>

2.  blanking is cumulative through the effect of all blanking covariates
    that remain in effect.  If you choose a condition which would create
    the same name as used by a previous condition  (see example above) it
    will raise an error.  You can force overwrite with -o.

3.  To restrict sample based on available of some variable, use a regular
    null covariate for that variable, as documented for the covariate
    command, for example:

    covariate age()

    null covariates (having following empty parentheses) are not included
    in likelihood estimation, but are used to delimit the available sample,
    just as blanking covariates are.

4.  You may also create covariate definitions just like blank does.  But
    be careful because it is easy to do it wrong.

Return to command index

A1.11 boundary


Purpose:  Change artificial boundary heuristics

Usage:   boundary                                     ; show settings
         boundary wide [start|off]                    ; wide boundaries
         boundary null [on|off]                       ; use boundaries from
                                                      ;  null model
         boundary start upper <term> [<term> ...]     ; Initial upper bounds
           <term> ::== <number> or <number>*h2r
         boundary float upper <number>                ; Later upper bounds
         boundary change <number>                     ; Amount to change by
         boundary crunch <number>                     ; Crunch bounds +/-
         boundary quadratic tol <number>              ; quadratic tolerance
         boundary max crunch <number>                 ; Maximum crunches
         boundary h2r factor <number>                 ; Bound h2r
         boundary e2 squeeze <number>                 ; Bound e2
         boundary trace [off]                         ; Trace upper bounds
         boundary hints                               ; More discussion
         boundary cov retries <number>                ; Max covar retries
         boundary cov incr <number>                   ; On each retry,
                                                      ;   increase cov bound
                                                      ;   by this factor
Examples:
         boundary start upper .2 .1 .05
         boundary float upper .05

Notes:   To function properly, the maximization algorithm used by SOLAR
         needs a little bit of help in the form of artificial boundaries.
         In general, any variance component can assume a any value from
         0.0-1.0, but in any particular case the range is more limited, and
         artificially limiting the range helps SOLAR maximize successfully.

         A set of heuristics and retry algorithms has been developed for
         SOLAR to set and adjust artificial boundaries.  The heuristics
         should not normally require adjustment.  If they do, please send
         a message to solar@txbiomedgenetics.org so we can improve SOLAR.

         You will know if you are having trouble with the boundary
         heuristics becase you will get 'Boundary' or 'Convergence'
         errors.  Beginning with SOLAR version 1.4.0, you will not get
         'Boundary' errors for variance components because SOLAR will
         automatically increase the boundaries up to the theoretic limits
         (0.0-1.0) as required.  If you get 'Convergence' errors, you
         should try setting some of the heuristics to lower values than
         they have already.  In addition to these heuristics, there are
         now also built-in retry mechanisms which automatically increase
         bounds if they are found to be too small, or decrease bounds if
         they are too big (being too being can cause convergence
         problems).  SOLAR will always discover if bounds are set too
         small to find the correct result and increase them, but it may
         not be able to deal automatically with bounds that need to be
         set very close to the correct result.

         If you get Boundary errors for covariates, you can deal with
         them in one of two ways.  For one, you can simply set the
         covariate upper and lower bounds in the starting model to
         reasonable values using the 'parameter' command and then re-run
         the analysis.  Or, you can use the 'boundary cov retries' or
         'boundary cov incr' commands to adjust the covariate boundary
         retry mechanism (which is separate from the mechanism for
         variance component boundaries).  Covariate bounds do not
         have obvious theoretic limits, so it is impossible to
         automatically prevent all covariate boundary errors.


   boundary wide on    ... set wide boundaries for variance components (N/A)
   boundary wide start ... set wide boundaries at start of chromosome
   boundary wide off   ... use standard boundary heuristics

         "boundary wide on" causes the boundaries for future linkage
         models to be set to the full natural range (lower 0 upper 1).
         This has no effect on the model currently in memory, but will be
         applied to future linkage models created by the multipoint,
         twopoint, linkmod, and linkmod2p commands.  This supercedes the
         standard variance component heuristics ("boundary start upper,"
         "boundary float upper," "boundary h2r factor," and "boundary e2
         squeeze") and also sets "boundary null off."  THIS OPTION IS NOT
         YET AVAILABLE (use "boundary wide start" instead).

         "boundary wide start" causes the boundaries for future linkage
         models to be set to the full natural range for the first QTL on
         each chromosome.  After the first QTL, the standard heuristics
         are applied.  (For twopoint linkage, this is the same as
         "boundary wide on")

         Both "boundary wide on" and "boundary wide start" turn off
         "boundary null on," as the options are incompatible.

         "boundary wide off" restores the usual variance component
         boundary heuristics for future linkage models.  "boundary wide
         off" does not necessarily restore the exact boundaries
         previously in use, and it does not restore "boundary null on" if
         that had previously been in effect.

   boundary null on  ... set boundaries according to null model
   boundary null off ... back to standard boundary heuristics

         "boundary null on" causes the boundaries for future linkage
         models to be taken from the current null model.  In pass 1 of
         multipoint, for example, the boundaries would be taken from
         null0.mod, and in pass 2, they would be taken from null1.mod.
         In cases of persistent convergence failure, you can edit the
         boundaries in the null model and use "multipoint -restart"
         to attempt to resolve the jam.

         "boundary null on" turns off "boundary wide on" and
         "boundary wide start," if they had been operative, because the
         options are incompatible.

         In the case of h2q* parameters not defined in the null model
         (for example, h2q2 will not be defined in null1.mod, though it
         is required for all two-linkage models), the default is to
         use the boundaries for the previous h2q parameter.  SOLAR
         always defines h2q1 in null0.mod.

        "boundary null off" restores the usual boundary heuristics for
         future linkage models.  It does not restore "boundary wide start"
         or "boundary wide on" if those had been in effect previously.
         
   boundary start upper ... set upper bound starting point for h2q's
   boundary float upper ... set upper bound based on previous h2q value

         These commands apply to the upper bounds of h2q* parameters
         (e.g. h2q1).  The default values are deliberately chosen to
         be quite low because they are automatically raised as required
         by a retry mechanism.  If the starting values were set to high,
         convergence errors could occur, and the mechanism for handling
         convergence errors is not as robust because it doesn't know which
         boundaries to adjust.

         'boundary start upper' sets the starting value for the upper
         bound of each new h2q parameter at the beginning of each
         chromosome.  This can be set as a single number (0.0 - 1.0) or
         as a term including 'h2r' (such as 0.8*h2r, which is the default).
         (H2r will be taken from the preceding null model if one is found.
         For example, if there is one linkage component, the null model is
         null0.out, which contains no linkage components.  If there are
         two linkage components, the null model is null1.out which contains
         one linkage component.)  Multiple values can be specified, one
         for each multipoint scan.  The last value specified applies to
         all remaining scans.  The default value of 0.8*h2r means that
         the upper bound for each new linkage component is set allowing
         for 80% of the current residual heritability to be accounted for
         by the first locus.

         'boundary float upper' sets the value for the upper bound of
         the newest h2q parameter after the beginning of each chromosome.
         The upper bound floats above each previously maximized h2q
         value by this amount, which defaults to 0.1.

   boundary change

         'boundary change' sets the value by which a bound changes
         after a boundary condition is detected.  Upper bounds will be
         raised and lower bounds will be lowered by this amount.  The
         default value is 0.1.

   boundary crunch

         'boundary crunch' sets the boundaries around each variance
         component if a convergence error occurs and then invokes a retry.
         The default value is 0.1.  For example, if the previous value for
         h2r was 0.3, the new boundaries will be set at 0.2 and 0.4.
         Boundary crunch is only applied after convergence errors, after
         which the boundaries can expand again through the retry mechanism.

   boundary max crunch

         'boundary max crunch' sets the limit on the number of crunch
         attempts for each locus.  Any given crunch may be followed by a
         series of boundary expansions, so multiple crunches may be
         required.  The default is 10, to give a large reasonable chance
         of success (if success is going to be achievable).  Two crunches
         in a row are never permitted (that would be meaningless).

   boundary quadratic tol

         The normalized quadratic (for quantitative traits only) is
         normally required to be between 0.999 and 1.001.  For some
         problems, this is unrealistic.  To change the tolerance to
         +/- 0.01 (0.99-1.01), you would give the command:

              boundary quadratic tol 0.01

         The allowed range is 0 - 1.

   boundary h2r factor

         'boundary h2r factor' sets an upper bound for h2r based on the h2r
         value in the null model.  The default value of 1.1 means that
         h2r is allowed to grow to 1.1x the size it had in the null model.
         So far as I know, this has never needed adjustment.  In any case,
         if it is too small, the automatic retry system will handle it.

   boundary e2 squeeze

         'boundary e2 squeeze' sets boundaries for e2 based on the previous
         e2 value.  The default value of 0.1 means that e2 is allowed to
         deviate +/- 0.1 from the preceeding value.

   boundary trace [off]

         'boundary trace' enables a trace of the upper bound applied to the
         newest h2q for each locus, and shows all retries and perturbations.
         This feature may be shut off with 'boundary trace off'.

   boundary cov retries <integer>

         'boundary cov retries' sets the maximum number of retries during
         which the  covariate boundaries are increased.  The default is
         10.

   boundary cov incr <number>

         'boundary cov incr' sets the factor controlling the amount by 
         which a covariate boundary is increased during a retry.  The
         default is 5, which results in at least a five-fold increase
         on each retry.  (The actual increase depends on the difference
         between both boundaries, and so will be larger than 5 in the
         beginning.  This is subject to change.)
-          

Shortcuts: bou - boundary

Return to command index

A1.12 boundary-notes


Purpose:  Discuss boundary error resolution strategies

This is an extension of the help provided for the 'boundary' command, which
you should read first.

When convergence errors occur during a multipoint scan, scanning will
terminate at the end of the scan regardless of whether some LOD scores
reached criterion levels or not, and a message like the following will be
displayed on the terminal (and printed to the multipoint.out file):

   *** Exiting because convergence errors occurred in last pass

Also, to the terminal and the applicable output file for the scan, an
error code will be appended to the end of each line on which an error
occurred, for example:

chrom 18  loc     0      0.0000    -2203.917  0.022568  0.268372 ConvrgErr

The code "ConvrgErr" indicates that a Convergence Error occurred such
that it was impossible to find a good maximum likelihood estimation.

Beginning with version 1.4.0, SOLAR now uses a retry mechanism so that
boundary errors (related to variance components) will not occur.
Boundaries will be increased incrementally until their theoretic limits
are reached.  It is still possible that convergence errors might occur,
and those may be controlled with the boundary command.

Boundary errors related to covariates are also handled with a retry
mechanism controlled by the 'boundary cov retries' command and the
'boundary cov incr' command.  The default values should work in almost
every case, but it is not possible to say they will always work because
with covariates there are no theoretic limits.

If convergence errors occur, you should use the 'boundary' command to
lower the applicable artificial boundary setting heuristic.  For example,
if the error is at the beginning of a chromosome, you should use the
'boundary start upper' command to set a lower value than the default.

All you need to do when these errors occur during a multipoint scan is
to "restart" the scan after resetting the applicable heuristic.  The
restart will detect those models for which an error occurred, and
redo them with the new heuristics.  For example:

solar> boundary start upper 0.1 0.05
solar> boundary float upper 0.05
solar> boundary change 0.05
solar> multipoint 3 -restart

(In earlier releases, you had to edit out the models for which errors
 occurred in the multipoint1.out file and then restart.  Now SOLAR
 recognizes models for which errors occurred and will redo them by
 default.)

Return to command index

A1.13 build_grm


Purpose: Creates a empirical pedigree matrix from a plink data set 

Usage: pedifromsnps -i <input base name of plink data> -o <output csv file name>
       --freq <file made with plink_freq>
       [optional: -corr <alpha value>  -per-chromo -king -method_two -normalize
  -batch_size <batch size value> -id_list <file w/ subject IDs>
  -n_threads <number of CPU threads>]

 -i The base file name of the plink .bed, .bim, and .fam files.
 -o The base file name for the output.
   -freq Name of output file from plink_freq command.
   -n_threads Number of CPU threads used for matrix calculation. 
      Default: Automatically set based on hardware
   -per-chromo Outputs a separate matrix for each chromosome. Default: Disabled
   -corr <alpha value> Compute method one correlation GRM using 
  	this alpha value. Default: -1
	 -method_two Computes correlation GRM using a second method 
	described below. Default: Disabled
   -king Computes Robust King GRM instead of using a correlation method.
	Default: Disabled
 -batch_size <batch size value> Number of loci computed at a single
	time per CPU thread. Default: 500
   -id_list <file w/ subject IDs> Specified file contains a list of 
		subject IDs separated by spaces.  The resulting GRM will 
	only use these IDs and excluded all others. Default: All IDs are used
 -normalize When used during the creation of a correlation GRM the final values
	 	are normalized using the square roots of the diagonal values.  The result
	being that diagonal elements are 1 and off-diagonal elements are bounded
	by 1 and -1.  Z*_i_j = Z_i_j/sqrt(Z_i_i*Z_j_j) where Z* is the final value
	and Z is the unnormalized value, i refers to the index of subject i while
 	j refers to the index of subject j.

 Default method for GRM creation is correlation method one with default alpha = -1.
 
Correlation Method One:  
Z_i_j = 1/m * sum_k=0 to k=m (SNP_i_k - 2*f_k)*(SNP_j_k - 2*f_k)*(2*f_k*(1-f_k))**alpha

Z = GRM value

i = subject row index

j = subject column index

k = locus index

SNP_i_k = SNP value for subject i and locus k (must be 0,1,or 2)

f_k = major allele frequency of locus k

m = total number of loci not including loci that are missing values for 
  subjects i or j

      alpha = exponent for variance, best set to default value of -1 

Correlation Method Two:
	Z_i_j = 1/variance_sum * sum_k=0 to k=m (SNP_i_k - 2*f_k)*(SNP_j_k - 2*f_k)

Note: Variables are the same as above except for variance_sum

variance_sum = sum_k=0 to k=m 2*f_k*(1-f_k) 
Note: Locus index k is excluded if there is a missing SNP value for 
            subject i or j.  Therefore the variance_sum calculated 
            for one element of the GRM may differ from another element. 

Robust King Method:
Z_i_j = 1 - (sum_k=0 to k=m (SNP_i_k - SNP_j_k)**2)/(2*min(N_i,N_j))

N_i = number of heterozygotes for subject i
N_j = number of heterozygotes for subject j
Note: Heterozygotes for subject i or j are not included if the other
      subject is missing a value at that locus.

For information on robust king method see Manichaikul A, Mychaleckyj JC, 
Rich SS, Daly K, Sale M, Chen WM  
      (2010) Robust relationship inference in genome-wide association studies. 
 Bioinformatics 26(22):2867-287
       

Return to command index

A1.14 catenate


Purpose:  Concatenate strings

Usage:    catenate [<string>]*

Example:  set modelname [catenate $basename 0 .mod]

Return to command index

A1.15 change-notes


Purpose:  Display recent changes

Usage:    change-notes


Shortcuts: chang - change-notes

Return to command index

A1.16 chi


Purpose:  Compute probability for a chi-square value

Usage:    chi <value> <degrees>
          chi -number <value> <degrees>      ; return only the number
          chi -inverse <pvalue> <degrees>

Notes:    Without the -number argument, the result is returned as a string
          like this:

          p = 0.0012345

          (The sign will be "<" if below available accuracy.)

          With the -inverse argument, the chi-square value corresponding
          to a given p-value is returned. The -number argument does not
          apply when the inverse is computed and should not be given.

          chi will raise an error for certain out of bound conditions
          You may use a catch {} block to prevent this from ending scripts:

          set test [catch {set p [chi $val $deg]}]
          if {$test != 0} {set p 1.0}



Shortcuts: chi - chi

Return to command index

A1.17 chinc


Purpose:  Compute probability for a noncentral chi-square value

Usage:    chinc <value> <df> <lambda>

Return to command index

A1.18 chromosome


Purpose:  Select chromosome(s) for multipoint scan

Usage:   chromosome [<number>|<name>|<low>-<high>|all|*]+    ;select
         chromosome                  ; show currently selected chromosomes
         chromosome show             ; show all available chromosomes
         chromosome showm            ; show mibd's in pass (see note 2)

Examples: 
         chromosome 10
         chromosome 10-13 15-17 20
         chromosome 11 11p
         chromosome all              ; select all available chromosomes
         chromosome *                ; select all available chromosomes

Notes:   Use in conjunction with mibddir, interval, multipoint commands.

 (2)  The showm option lists the mibds's that will be selected by
      the current "chromosome" and "interval" commands.

 (3)  Alphanumeric chromosomes may not be in <low>-<high> ranges, but may
      be selected individually (for example, 11p), or with "all" or *.

 (4)  The chromosome specification is not saved from one solar session
      to the next unless put in a .solar file.

 (5)  For convenience, you may specify a chromosome or range of
      chromosomes whose mibds are not actually present, and
      the gap is ignored silently, as long as there are some mibds
      available for other specified chromosomes.  The chromosome
      command acts as a filter applied to the mibd data actually
      available.

Shortcuts: chro - chromosomes

Return to command index

A1.19 clod


solar::lod --

Purpose:  Calculate LOD score

Usage:    lod [<test-loglike> <null-loglike>] [<options>]
          options := [-auto|-off|-raw] [-trait <N>] [-rhoq <N>] [-v]
                     [-1t|-2t|-3t|-4t|-t1|-t2|-t3|-t4] [-nolodadj]

          If no likelihoods are specified, the likelihoods of the
          "current" model and the applicable "null" model are used.

          -auto        Convert multivariate LOD to 1df effective LODs based
                         on number of traits in current model and constraint
                         of relevant rhoq's (default)
          -off         Do not convert LODs to 1df effective
          -raw         Do not perform LOD conversion or lodadj
          -traits <N>  Convert multivariate LOD to 1dF assuming <N> traits
          -1t or -t1   Assume 1 trait (same as "-traits 1")
          -2t or -t2   Assume 2 traits (same as "-traits 2")
          -3t or -t3   Assume 3 traits (same as "-traits 3")
          -4t or -t4   Assume 4 traits (same as "-traits 4")
          -rhoq <N>    Convert multivariate LOD to 1df assuming <N>
                         constraints of relevant rhoq's
          -nolodadj    Do not perform lod adjustment (lodadj)
          -v           verbose: Show adjustment and conversion steps

Examples: outdir test1
          load model test1/null1
          lod
          lod -v
          lod -2000.51 -2030.87
          lod -trait 3 -rhoq 1 -v -2000 -2030
          lod -raw -2000 -2030

Notes:    If no likelihoods are specified, the current model must have
          been maximized through a command such as "maximize," "twopoint",
          or "multipoint", and the applicable null model should be saved as
          nullX.mod (e.g. null0.mod, null1.mod) where X is the number
          of active linkage elements, which is assumed to be one less
          linkage element than in the current model.  Linkage elements are
          parameters named h2q1, h2q2, etc.  The null model must have
          been saved in the maximization output directory, either named
          after the trait or set by the outdir command.

          By default, SOLAR provides easily interpreted "1 df effective" LODs
          which are equivalent to those in univariate models.
          However, you can also have complete control over the LOD
          conversion performed either using arguments here or
          preferences set globally with the lodp command.  Options
          specified here override the defaults and lodp preferences.

          The correction of 2 trait LODs to 1dF effective LODs is based
          on this formula: the LOD is converted to chi square with
          1/2 1df, 1/4 3df, and 1/4 point mass at zero.  If rhoq is
          constrained, the formula is 1/2 1df, 1/4 2df, and 1/4
          point mass at zero.  This is then converted to a 1/2 1df
          chi square of equivalent p-value, which is divided by
          2ln10 to get the 1df effective lod score.

          The correction of 3 trait LODs to 1dF effective LODs is based
          on the formula: the LOD is converted to chi square with
          3/8 1df, 3/8 3df, 1/8 6df, and 1/8 point mass at zero.
          For each rhoq constrained, the 6df is changed downward
          by 1df.

          The conversion of higher multivariate LODs follows a similar
          expanding sum.  If you wish to see the weights used, use the
          lod command with the -v option.

          Empirical LOD adjustment, if any, is automatically applied (see
          the lodadj command) unless the -raw option is used.  Unless you
          specify -raw, SOLAR will need to search the output directory for
          a lodadj.info file, which means that a trait or outdir must
          have been selected.

          Empirical LOD adjustment is not yet supported for bivariate
          models.  The lodadj value is ignored when bivariate LODs are
          computed, and, in the cases where the lodadj value would be
          shown (such as in the multipoint.out file, or if lod is called
          from the command prompt) a warning message is shown instead.

          In SOLAR version 3.0.2, the "clod" and "lod" commands were
          combined into a new "lod" command.  The options allowed
          have changed compared with the earlier "clod" ; the original
          "lod" command did not allow any arguments.

          Use the "lodn" command if you the current model may not use
          the "h2q1" linkage parameter and you are not specifying
          loglikelihoods explicitly.
          
          See also lodn, lodp, lodadj.

Shortcuts: clod - clod

Return to command index

A1.20 combinations


Purpose:  Make a list or count combinations of integers 1..N of size K

Usage:    combinations <N> [<K>] [-max <maxsize>] [-list list] [-force]
                       [-count] [-counts] [-start <number>] [-stop <number>]

          N        defines the range of integers 1..N of interest.  If no
                   other arguments are specified, and N <= 10, the set of
                   all combinations of this range of integers is returned.
                   To get a list of combinations where possibly N > 10, 
                   add either the -list or -force option, with -list being
                   the preferred method.

          K        only include combinations of exactly this size (as
                   in traditional "combinations").  If this argument is
                   not specified, the default is to include combinations
                   of all sizes, starting from the smallest size.

          -count   Only return the NUMBER of combinations, not a list
                   of the actual combinations.

          -counts   Return a list containing the number of combinations for
                    each "size" (i.e. "K").

          -max     include all combinations up to and including this size
                   (the default is to include combinations of all sizes).
                   The K and -max arguments may not be used at the
                   same time.

          -list    APPEND combinations to this list rather than returning
                   them.  Specify the list variable by name, as with the
                   Tcl lappend command (see example below).  If the variable
                   is not already set, a new variable is created.  When this
                   argument is used, nothing is returned.  For example:

                       set comblist {}
                       combinations 20 -max 10 -list comblist

                   Be sure to empty the list first (as shown above) if you
                   do not want to append to the previous contents, if the
                   variable was used previously in the same procedure.  This
                   option may save memory (as compared with -force) for
                   very large N since only one copy of the list is ever
                   created.
                 
          -force   return list ("by value") even if N > 10.  This
                   is required for N > 10 unless the -list, -count, -counts,
                   -start, or -stop arguments are given.  Only use this
                   option if you are sure this is what you want to do.
                   Read all the following paragraphs to be sure.  Generally,
                   you would only use it inside a script, where the
                   returned combinations are going to be immediately saved
                   to a variable, such as:

                       catch {set comblist [combinations $N -force]}

                   The reason to require a -force option is that if a
                   large N is given in an interactive session, the
                   terminal window could be locked up for hours displaying
                   all the combinations, with no way to break out until
                   the terminal output buffer is empty.  If that were to
                   happen, you would probably want to kill the whole
                   terminal session from another terminal window.  For
                   some users, that would probably require calling the
                   system administrator.

                   The -force option may require more memory than the -list
                   option because a copy of the list is created in the
                   process of "returning" it to the caller; that's just
                   the way Tcl works, and it becomes important when creating
                   lists with huge numbers of elements.

                   If you are using this form of the command in a script,
                   be careful that it is not the last command in the
                   script, which Tcl automatically returns.  Then, if
                   the user runs the script from the terminal, the
                   terminal window would be locked up.  If you must
                   use it as the last command in a script, you should
                   use a "catch" command around it, as in the example
                   above.  The catch command only returns 0 (for success)
                   or 1 (for error).

          The following options are useful when dividing up the set of
          combinations into jobs of an equal size.  Otherwise, they may
          seem a bit obscure.

          -start   Start with combination number <number>

          -stop    Stop with combination number <number>

Notes:

CAUTION!  The list can get VERY BIG!  Be careful if n > 15 because
memory requirements double for each [incr n], unless you are setting k
or -max.  ("BIG" means 100's of megabytes, gigabytes, etc.  I am not
kidding. On Solaris systems, you can use the SOLAR "memory" command to see
just how much memory SOLAR has consumed.)


Shortcuts: comb - combinations

Return to command index

A1.21 constraint


Purpose:  Create, list, or delete constraints

Usage:    constraint term [+ <term>...] = <number> | <term>
            <term> is [<factor>*]<parameter>
          constraint                            ; display all constraints
          constraint delete <number>            ; delete constraint by number
          constraint delete <spec>|<left-spec>  ; delete specified constraint
          constraint delete_all                 ; delete all constraints

Example:  constraint e2 + h2r = 1
          constraint bq1 = bq2
          constraint delete bq1
          constraint delete h2r + e2 = 1
          constraint H2 + 3*H2q1 - 2*H2q2 = 5*E2  ; anything is possible

Notes:    (1) The constraint numbers are shown when listing constraints.

          (2) If a new constraint matches the right hand "body" of an
              existing constraint, that existing constraint is replaced by
              the new constraint (the old one would be invalid anyway).

              solar> constraint sd = 1
              solar> constraint
              [1] sd = 1
              solar> constraint sd = 0
              solar> constraint
              [1] sd = 0

          (3) For the "constraint delete <left-spec>" command, if there
              is a constraint matching the entire left specification of a
              constraint, it is deleted.  Or, you can specify the entire
              specification to deleted.  (The "constraint delete <parameter>"
              version of the constraint command was ambiguous and is now
              obsolete.)

              [1] e2 + h2r + h2q1 = 1
              [2] h2q1 = 0
              constraint delete h2q1              ;# deletes constraint [2]
              constraint delete h2q1 = 0          ;# deletes constraint [2]
              constraint delete e2 + h2r + h2q1   ;# deletes constraint [1]

          (4) Instead of constraining covariate beta values to 0, use
              the "covariate suspend" command instead as this permits
              greater efficiency.

          (5) If you need to constrain interaction covariates (e.g. age*sex)
              or parameters whose name begins with a number, or parameters
              whose name includes other special characters, enclose
              the parameter name in angle brackets <>.  When deleting the
              constraint, angle brackets are optional around the parameter
              name.  Do not include numeric factors in the delete command.

              constraint 3*<bage*sex> = 1
              constraint delete bage*sex

         (6)  Constaints may only be simple linear equations of terms
              which include a optional numeric factor and a parameter name.
              Operating exponents and functions are not supported.
              If you need to constrain a power of some model feature,
              consider making the parameter itself contain the required
              power, then it can be linearly constrained.

         (7)  Numeric constants (such as 1 or 0) should only appear as
              the right hand term.


Shortcuts: cons - constraints

Return to command index

A1.22 copybin


Purpose:  Install new executable file without disturbing current users

Usage:    copybin <filename> <directory>

Note:    The way this works is quite simple.  The original version of the
         file is not overwritten or deleted, but instead renamed.  
         Running processes continue to access the original version
         through the inode, regardless of the name change, while new
         processes will access the new version.  The renaming scheme
         simply appends dot followed by a number to the filename.
         The first available number starting from 1 is used.  For
         example, the old "solarmain" becomes "solarmain.1" or
         "solarmain.2" if a "solarmain.1" already exists, etc.  At some
         point you might want to clear out some of the older versions, but
         that is up to you, and it would lead to numbering that is not
         sequential, since copybin always takes the first available
         number.

         This is similar in design to the Unix "install -f" command.
         It lacks some of install's checking features, but in one way
         is much more capable: it allows any number of new versions to
         be installed without disturbing users of the first or any other
         previous version.  The Unix install command only has one level
         of backup since it merely prepends "OLD" to the original name.
         If you do two install's in a row over a timespan in which jobs
         are continuing to run (as, unfortunately, is often required)
         copies of the original version are lost and users are likely
         to get a memory mapping error of some kind.

         This seems to work across NFS mounted filesystems, but it might
         not work for you, so be wary.  Actually, in ancient Unix days this
         command might not have been necessary, but now that memory mapping
         is used to load image files, it is necessary now.

Return to command index

A1.23 countfields


Purpose: determine consistency of number of columns in a comma delimited file

Usage: countfields <filename>

An information report is returned like this:

longest: 8 (#1) x 1497    shortest: 8 (#1) x 1497

This means that the longest record had 8 fields, the first such record was
#1, and it was followed by 1497 others of same length in the file.

As it happens, the shortest record also had 8 fields, it was #1, and followed
by 1497 of the same length in the file.

Return to command index

A1.24 covariate


Purpose:  Set up covariates (independent variables).
          It can handle interactions and polynomial terms.
          For other non-polynomial models, use the 'mu' command.

Usage:    covariate <variable>[^n | ^1,2[,3...]][*<variable> | #<variable>
                                                [([trait])]]*
                    Creates a new covariate.  See below for examples.
                                         ;
          covariate                      ; display all covariate info
          covariate delete <string>      ; deletes covariate and beta(s)
          covariate delete_all           ; deletes all covariates and beta(s)
                                         ;
          covariate <variable>()         ; Null Covariate: require var in
                                         ;   sample without covariation
                                         ;
                                         ; Covariate Suspension (for
                                         ;   temporary hypothesis testing).
          covariate suspend <string>     ; temporarily disable covariate
          covariate restore <string>     ; re-activate suspended covariate

Examples: 

  covariate age                       ; Simple covariate Age
  covariate age*sex                   ; Age by Sex interaction (only)
  covariate age*diabet*diameds        ; 3-way interaction
  covariate age^2                     ; Age squared as a simple covariate
  covariate age^1,2                   ; Shorthand for: age age^2
  covariate age#diabet                ; Shorthand for the following:
                                      ;   covariate age diabet age*diabet
  covariate age^1,2,3#sex             ; Shorthand for all the following:
      covariate sex age age*sex age^2 age^2*sex age^3 age^3*sex

  covariate sex age(q1) age*sex(q3)   ; Trait-specific Covariates:
                                      ;   covariate sex applied to all traits
                                      ;   covariate age applied to trait q1
                                      ;   covariate age*sex applied to q3

  In a multivariate analysis, trait-specific covariates are only required
  for the sample of their trait.  See note (7) below.

  covariate q2()                      ; Null-Covariate:
                                      ;   require q2 in sample of all traits

Notes:    (1) More than one covariate may be specified separated by spaces.
              Also more than one covariate command may be used.  Adding a
              new covariate does not remove previous ones.  Spaces are
              not allowd within the specification of each covariate term.
          (2) Pound (#) and comma (,) are shorthands allowed ONLY if there
              are no more than two variables.  Further, only the first
              variable may have multiple exponents separated by commas.
              The following are INVALID:

                  covariate age^1,2*diabed*diamed   ;# INVALID
                  covariate age#diabet*diamed       ;# INVALID
                  covariate weight*height^1,2       ;# INVALID

              Instead, specify multiple covariates as required:

                  covariate age*diabet*diamed age^2*diabet*diamed
                  covariate age diabet diamed age*diabet age*diamed 
                  covariate diabet*diamed age*diabet*diamed
                  covaraite height^1,2*weight

          (3) N-way interactions are possible to any N.

          (4) Covariate commands create beta parameters automatically.
              Beta names begin with 'b' followed by the variables
              and exponents as in the non-shorthand form (e.g. bage^2*sex).

          (5) Quantitative variables are mean-adjusted.  Binary variables
              are adjusted so that the lowest value is 0 and the highest
              value is 1.

          (6) Suspended covariate variables are still required in sample.

          (7) In a bivariate analysis "unqualified" covariates are applied
              to and required by all traits, and trait-specific covariates
              (with parenthetically named trait) apply only to the named
              trait and are only required for that trait.  (This was
              changed in SOLAR version 4.0.3.)

              In a univariate analysis, ALL covariates are are currently
              required regardless of whether they apply to the trait.
              (The requirement of covariates specific to a different trait
              may be removed in a future update.)

              Null covariates (such as ef() ) are not applied to any
              trait, but are required by all traits.

              Variables not declared as covariates, but used in the mu
              equation, are (at this time) required by all traits.

          (8) When a trait is changed, covariate beta parameters are
              and reset to force re-evaluation of boundaries
              on the next maximization.  But trait changes
              are not permitted for bivariate models; "model new" is
              required.  "model new" will remove all covariates.

          (9) When created, beta parameters have starting value 0.0 and
              no boundaries.  Likely boundaries are set automatically
              during the first maximization, and expanded if necessary up
              to an internally determined maximum (and you may further
              expand them if need be but this is highly unlikely).
              If desired, you may set boundaries after beta values have
              been created, and this will bypass automatic boundary
              setting (but not automatic boundary expansion).


Shortcuts: cov - covariates

Return to command index

A1.25 create_evd_data


Purpose: create_evd_data performs an EVD on the loaded pedigree for gwas or. 
gpu_gwas commands. This is useful for a data set with a large number of subjects.

Usage: create_evd_data --o <output base filename> --plink <plink set base filename> --use_covs

Prior to running the command select the trait that you plan to run gwas, gpu_gwas, or gpu_fphi 
with the trait command. The --plink option specifies a plink data set that will determine which 
ids will be used in running the EVD. The --use_covs options will include the ID set of covariates 
specified through the covariate command.  This command now allows you to enter two traits at once
in order to get their corresponding ID set.    
Output consists of three files labeled:
    <output base filename>.ids --list of subject ids
      <output base filename>.eigenvalues --list of eigenvalues
      <output base filename>.eigenvectors --list of eigenvectors
      <output base filename>.notes --notes on the creation of the EVD data set	

Return to command index

A1.26 create_fake_pedigree


Purpose: This command creates a pedigree file given a phenotype file taken as input. 

Usage: create_fake_pedigree <phenotype filename> [-o output pedigree filename
      <phenotype filename> Phenotype filename to be used to create pedigree
[-o <output pedigree filename>] Option to name output pedigree filename

Return to command index

A1.27 d2e


Purpose: convert Fortran D style exponents to E form

Usage: d2e <inputfilename> <outputfilename>
       d2e2 <inputfilename> <outputfilename>

d2e2 starts with line 2, so as not to disturb D's in the header line of
comma delimited files.  d2e is more suitable for pedsys files.

SOLAR now understands Fortran D style in phenotypes files in most
circumstances anyway, so this conversion is not generally needed.  You
will know you need this if you see error messages.

Return to command index

A1.28 d2e2


solar::d2e --

Purpose: convert Fortran D style exponents to E form

Usage: d2e <inputfilename> <outputfilename>
       d2e2 <inputfilename> <outputfilename>

d2e2 starts with line 2, so as not to disturb D's in the header line of
comma delimited files.  d2e is more suitable for pedsys files.

SOLAR now understands Fortran D style in phenotypes files in most
circumstances anyway, so this conversion is not generally needed.  You
will know you need this if you see error messages.

Return to command index

A1.29 define


Purpose:  Define an expression to be used in place of a trait or covariate

Usage:    define <name> = <expression>  ; create a definition
          trait <name> [,<name>]+       ; use definition as trait(s)

          define                        ; show all defininitions
          define <name>                 ; show definition for <name>
          define delete <name>          ; delete define for name
          define new                    ; delete all expressons
          define delete_all             ; delete all expressons
          define rename <name1> <name2> ; rename define name
          define names                  ; return list of all names

          <name> can be any alphanumeric string with underscore, do not
          use these reserved words as names:

              delete delete_all rename names new

          <expression> is formatted algebraically using standard
          math operators + - * / and ^ (power) and () parentheses, and
          also all math functions defined by the C Programming Language
          which includes "log" for natural logarithm, trig functions,
          and hyperbolic functions, among others.  Here is a list:
          erfc, erf, lgamma, gamma, j1, j0, y1, y0, rint, floor, ceil, 
          tanh, cosh, sinh, atan, acos, asin, tan, cos, sin, expm1, exp,
          logb, log1p, log10, log, cbrt, sqrt, and abs.  In addition,
          the inverse normal transformation (see help for "inormal") may
          be applied using the "inormal_" prefix (for example,
          inormal_q4 for trait q4).  "inormal_" may be abbreviated
          down to "inor_".  

          If a phenotype name within the expression contains special
          characters (anything other than letters, numbers, and underscore)
          it should be enclosed in angle brackets <>, and the angle brackets
          must also include any special operator prefix such as "inorm_".
          For example, given a trait named q.4 (with a dot), you could
          have a define command like this:

              define i4 = <inorm_q.4>

          Note: similar rules apply within the constraint and omega commands
          because those commands also allow expressions that could contain
          decimal constant terms and math operators.

          A debugging function named "print" is also available which
          prints and return the value of the expression it encloses.
          After printing, it pauses until the RETURN key is pressed.
          RETURN may be held down to pass through a lot of prints.
          Examples of the print command are given in the documentation
          for the "omega" command.

          The following relational operators may also be used between
          any two terms.  If the relation is true, 1 is returned,
          otherwise 0 is returned.  This enables you to construct
          compound conditional expressions having the same effect as
          could have been done with "if" statements.  The C operators
          < and > have been replaced with << and >> so as not to be
          confused with the <> quotation of variable names in SOLAR.

          C Format    Fortran Format    Test
          --------    --------------    ----

          ==          .eq.              if equal
          !=          .ne.              if not equal
          >=          .ge.              if greather than or equal
          <=          .le.              if less than or equal
          >>          .gt.              if greater than
          <<          .lt.              if less than


          An expression is understood to be quantitative unless the
          top level operator is a relational operator, in which case
          it is understood to be discrete.

          Names used must not match the names of any phenotype.  When
          there is an unintended match, the definition can not be used for
          trait names since it would be ambiguous.

          Once a valid definition has been created, it can be used in
          the trait command.  Any or all of the traits can be definitions.
          All definitions will be saved in the model file, and loaded back
          in when that model is reloaded.  Definitions in a model file
          will override current definitions.  It is possible to save a
          model with nothing but definitions if desired.  The only
          way to delete definitions is with the "new" "delete" or
          "delete_all" options, or by restarting SOLAR.  The "model new"
          command has no effect on definitions.

          Expression names are not case sensitive.  Assigning a new
          expression to a name replaces the expression previously
          assigned to that name, even if it differs in letter case.
          Renaming a definition to a name differing only in letter
          case is possible.

          For covariates only, it is possible to include in definition a
          constant called "blank".  If an evaluation of the expression
          returns blank, that individual is counted as missing from the
          sample.  The best way to use this constant is with one or
          more conditionals like this:

          define sample = blank*(age<<22)*(sex==2)
          covariate sample()

          This blanks any male (sex==2) having age less than 22.
          blank is the number -1e-20, so any numerical operation may
          change it to a non-blank small number.  It should only be
          multiplied by 0 or 1.  The empty parentheses after sample() mean
          that it is not a maximized parameter, it is a null covariate
          only used to delimit the sample.

Examples:

          define loga = log(a)
          define eq1 = (q1 - 3.1)^2
          define dq4 = q4 .gt. 12

Return to command index

A1.30 deputy


Purpose:  Make limited user key (for deputy registrars)

Usage::   deputy register <deputy-key>
          deputy make <access-code> <username>

Notes:

1) Deputy registrar must obtain deputy-key and access-code from
   solar@txbiomedgenetics.org.  Key is granted for critical collaborators
   only for use in cluster systems where normal registration process
   is unwieldy.

2) Deputy registrar uses "deputy register" command to register as
   deputy.  This creates a file named .solar_deputy in home directory.
   (Note: It does not move the .solar_deputy file to SOLAR_DEPUTY_HOME
   if that is different from the deputy's home directory.)

3) The .solar_deputy file must be copied to a user to a deputy
   directory on all systems.  This can be done in one of two ways.
   The default way is to access the .solar_deputy file in the home
   directory of the deputy, which must be found in a pathname
   with the deputy's username replacing the current username.  For
   example if deputy registrar jsmith has registered the name pmiller,
   and the home directory for pmiller is:

       /home/pmiller

   Then the .solar_deputy file must be found in directory named:

       /home/jsmith

   If this default method cannot be used, there is an alternate
   method involving creating a shell variable SOLAR_DEPUTY_HOME
   giving the path to the .solar_deputy file.  For example, the
   following line could be added to the "solar" startup script:

       export SOLAR_DEPUTY_HOME=/home/admin/jsmith
   
4) The deputy registrar can now make a limited range key for each
   user using the "deputy make" command.  The user uses the
   normal "register" command to install the key into a file named
   .solar_reg in the user's home directory.  The .solar_reg file
   AND the .solar_deputy file (located as described in note 3)
   must be found on each system where SOLAR is to be run because
   both are used in the validation process for keys created by
   deputy registrars.

5) The "deputy make" command adds the usernames registered to a file
   named "solar_registrations" in your home directory.  The contents
   of this file should be sent to solar@txbiomedgenetics.org on at least
   a biannual basis.

6) Username must be 2 characters or longer.

Return to command index

A1.31 discrete-notes


Purpose:  Describe support for discrete traits

Usage:    discrete-notes

Discrete traits are detected automatically by SOLAR.  They must be
coded as two integer values separated by exactly 1.  Typical codings
are 0,1 or 1,2.  If you specify two values that are not separated by
exactly 1, this will be detected as an error.  If you specify more than
two values, your trait will not be determined to be discrete.  For
this reason, DO NOT specify missing values with a third number. 
Missing values should always be coded as blank (" ") or null with no
number or character.  DO NOT use "0" to signify missing values.  See
toward the bottom of this note for advice regarding analyzing traits
with two values quantitatively.

Discrete traits having more than 2 values are not supported by SOLAR.

(This is also true for discrete phenotypic covariates: if discrete, they
 should not have more than 2 values.  If you have such data, they
 should be recoded into N-1 discrete binary covariates or recast into
 "household groups."  See the documentation for the "house" command.)

Models with discrete traits may be used with any command in SOLAR such as
polygenic, twopoint, multipoint, maximize, etc.  Sometimes the
information returned by SOLAR differs.  For example, while the
"polygenic" command normally returns "proportion of variance due to all
covariates" when used with a quantitative trait, it instead returns the
"Kullback-Leibler R-squared" when used with a discrete trait.  (For
technical reasons, the proportion of variance due to all covariates is
not available for discrete trait models in SOLAR.)

By examining the maximization output files you can determine unambiguously
whether discrete or quantitative methods were used.  (An example of
a maximization output file is "null0.out" saved in the maximization
output directory after running "polygenic".)  In this file, immediately
after the "Descriptive Statistics" and immediately before the "Model
Parameter Starting Points and Boundaries" there will be one of two
comments, either:

               Using SOLAR Quantitative Trait Modeling

or

                 Using SOLAR Discrete Trait Modeling

When a model with a discrete trait is maximized, special discrete trait
algorithms are used. Unfortunately, these methods are much more prone
to numerical problems than the usual "quantiative trait" methods.
Numerical problems lead to faulty parameter estimates and convergence
failures.

The default descrete method is relatively robust and only infrequently
has the problem where the heritability erroneously gets close to 1.0.

Even if the polygenic heritability (h2r) goes to 1.0, you may still be
able to run a "multipoint" linkage analysis to find important locii.
The heritibilities will be wrong, and the LOD scores will be wrong,
but the "peaks" may be at or near the correct locations.

It is not recommended to use the optional second discrete method set by
giving the command "option DiscreteMethod 2" prior to running
polygenic.  Although it was intended to be more accurate, it more
frequently fails with convergence errors or having the heritability go
to 1.0, and at this time it is not recommended.

Some people also try analyzing their discrete trait as quantitative.
This can be done by giving the command "option EnableDiscrete 0".
The likelihoods, LODS, and parameter estimates may be inaccurate, but the
LOD peaks should be in the correct places.  Better convergence is
sometimes obtained, however, than when using the discrete method.

Beware that there is a fundamental error when analyzing a discrete trait
as quantitative.  There are not truly two degrees of freedom for the
mean and SD.  Therefore, convergence failure is still more common with these
models than with true quantitative models.

Also beware that if you had previously analyzed the trait as discrete,
and then changed the EnableDiscrete option to 0 without exiting SOLAR
or giving the "model new" command, you will still have parameter SD
constrained to 1.0, which is probably NOT what you need to do.  SD is
properly constrained to 1.0 only when you are analyzing a discrete trait
as discrete (and, perhaps, in a few other esoteric cases).

Because of all the pitfalls in using discrete traits, we try to find and
use relevant quantitative traits whenever possible.


Shortcuts: discrete-note - discrete-notes

Return to command index

A1.32 doc


Purpose:  Find the SOLAR documentation

Usage:    doc [-whereis]

          doc               show official documentation URL and
                            location of documentation files on this system

Notes:    This command now tells you the URL where SOLAR may be seen with
          any available browser.  Previously, it would run Netscape.

Shortcuts: doc - doc

Return to command index

A1.33 dominance-notes


Purpose:  Find dominance documentation

Dominance analysis is documented in section 9.4 of the SOLAR
manual.  Use the "doc" command or point your browser to
the "doc" directory of your SOLAR directory, then select
"Go to full SOLAR manual", then select Chapter 9.

Dominance analysis is made possible by the "delta7" and "d7" columns
in SOLAR phi2.gz and ibd matrices.  For polygenic models, the delta7
matrix column is loaded, a d2r parameter is created and added to the
"e2" constraint, then a delta7*d2r term is added to the omega.  The
commands required to do this are described in Section 9.4

Return to command index

A1.34 doranch


Purpose:  execute a script on every ranch machine (usually for /tmp cleanup)

DO NOT USE THIS FOR SUBMISSION OF REGULAR JOBS because it bypasses
the Gridware queing system, which it must do for cleanup of ALL machines.

MUST BE RUN ON MEDUSA (only medusa addresses all other ranch machines)

See also "stepup -parclean" which uses doranch to cleanup junk created by
forcing a "stepup -par" job to quit.

Usage:    doranch <procname> <argument>

          doranch cleanuser <username>   ;# delete ALL user's /tmp files on
                                         ;# the ranch (Note: you can only
                                         ;# delete files for which you have
                                         ;# delete privilege, usually because
                                         ;# of owning them.)

          doranch finduser <username>    ;# find all my /tmp files on the
                                         ;# ranch but do not delete them.
                                         ;# Findings are written
                                         ;# to finduser.out.  If -all is
                                         ;# used, all users are shown.

          doranch cleantmp <dirname>.    ;# same as "stepup -parclean"
                                         ;# delete all /tmp/<dirname>.*
                                         ;# files.  (parallel stepup dirs
                                         ;# are prefixed with <dirname>
                                         ;# followed by dot.

          doranch findtmp <dirname>      ;# find all name* directories
                                         ;# but do not delete them.  Findings
                                         ;# are written to findtmp.out.

          doranch cleanme now            ;# same as
                                         ;# doranch cleantmp <username>

          make_rhosts                    ;# make a new .rhosts file, or
                                         ;# append to existing one to
                                         ;# make it complete.  It may be
                                         ;# useful to delete old .rhosts
                                         ;# file first if it contains errors.

          showspace                      ;# Return sorted list of /tmp
                                         ;# storage used by all users
                                         ;# in showspace.out.  Uses
                                         ;# doranch finduser -all, unless
                                         ;# existing finduser.out is found.


          <procname> is the name of the procedure to be run on every
                     machine.  procedures cleanuser, finduser, cleantmp,
                     findtmp, and cleanme are provided, but user-written
                     scripts could be used also.

          <username> is the username.

          cleantmp is a procedure that deletes all files and directories
          in /tmp which match the specified prefix, after which a wildcard
          * is assumed.  For example "cleantmp charlesp." would delete a
          directory named "/tmp/charlesp.11019.2"

Notes:  It is useful to run ranch jobs in subdirectories of the /tmp
        directory to minimize network traffic.  Jobs should be designed to
        cleanup after themselves in normal operation by deleting the
        /tmp subdirectory that was used as a working directory.

        However, even when jobs are designed to cleanup after themselves,
        if the jobs do not run to completion, the cleanup code might never
        be run.  This is especially true when a user or administrator
        shuts down a large array job (such as "stepup -par") because of
        a mistake or emergency.

        That is when "doranch" may be useful.  The "cleanuser" procedure
        deletes all files owned by the user in /tmp directories on
        all ranch machines.  The "cleantmp" procedure deletes all files
        and directories in /tmp prefixed by the cleantmp argument on all
        ranch machines.

        The doranch procedures listed above may be used in creating custom
        cleanup options for other scripts.

        Such an emergency cleanup option is already built into the stepup
        command as option "-parclean".  That uses doranch and cleantmp
        as shown above.  Authors of other parallel scripts for general
        create similar script options tailored to the names of /tmp
        subdirectories they use.

        To see what the "finduser" script looks like, in order to write
        something similar, use the solar command "showproc finduser".

        All the doranch procedures write to a file named by the specified
        procname, for example cleanuser writes to a file named cleanuser.out
        for each file found.  Usually this has two columns, node name
        and filename.  However, for "finduser" a middle column is added
        which lists total diskspace used in kbytes.

        Note that a valid .rhosts file is required for usage, and
        the make_rhosts file will make one.  doranch will complain
        if the .rhosts file is not present or incomplete.

        If doranch reports failure in connecting to some hosts, it is
        probably because the passwd and shadow files involved in userid
        authentication have not been properly updated on those hosts.

        If doranch reports failure in connecting to every ranch host, it
        probably means that the .rhosts file is invalid, and you should then
        delete the old .rhosts file and run make_rhosts.

        If doranch hangs at a particular host, that machine is probably
        down in some unusual way that is not known to gridware.
        

Return to command index

A1.35 drand


Purpose:  Return a random floating-point number between 0 and 1

Usage:    drand [ <seed> ]

          If no argument is given, drand returns a floating-point
          number in the interval [0,1].

          When an argument is given, it is taken to be an integer with
          which to seed the random number generator.  If a seed value
          of 0 is specified, the system time is used as the seed.  The
          random number generator should be seeded prior to its first
          use in a SOLAR run.  If the random number generator has not
          been seeded when it is first called, it will be seeded with
          the system time automatically.


Shortcuts: dran - drand

Return to command index

A1.36 empirical_pedigree


solar::epedigree --

Purpose: Load an empirical pedigree

Usage    load epedigree <epedigree> [-t <threshold>] [-founders] [-1]
         load pedigree <epedigree> [-t <threshold>] [-founders] [-1] [-sex <sexvarfile>]

          Empirical pedigrees are csv files which are representations of
          a kinship matrix, having kinship values for pairs of individuals.
          Empirical pedigrees may be loaded either with the regular
          "pedigree" keyword or "epedigree".  However if "load pedigree"
          is used, and there is an error, the error message may be more
          helpful if "load epedigree" is used..  Empirical pedigree
          files must have fields named IDA, IDB, and KIN specified
          in the header line to be recognized as an empirical pedigree.

         -t <threshold> sets a minimum kinship value that indicates
         a family relationship.  Levels lower than this are not
         recorded in the phi2.gz matrix produced, and do not
         indicate a family relationship between such individuals.
         The default threshold is zero, so any non-zero kinship
         value in the empirical pedigree file indicates a
         relationship.  Given some noise in the kinship estimation,
         this means that families could become too large for
         efficient processing.  In that case, a threshold value can
         help.  A possible value would be 0.05.  This will
         potentially lose information, however, and is only ever
         needed for greater efficiency or capacity in later processing.

         -1 (one family) simply bypasses the normal family determination
         and regards all the individuals present as one family.  This
         can save a small amount of time for loading the pedigree, but
         if multiple families would otherwise be identified, it will
         make the phi2.gz matrix file larger and subsequent processing
         much slower.

         -founders (all founders) regards all input individuals as
         separate families having no relationship to anyone else.

         -sex <sexvarfile> permits the loading of a sex variable for
         this pedigree from the named file, which should be a CSV file
         having fields ID and SEX.  Other fields are ignored.  The
         sex variable coding is the same as for normal pedigree files:
         M, m, or 1 for males, and F, f, or 2 for females.  Missing
         values are not allowed.  All individuals in the pedigree must
         be included.

         This command does not attempt to generate a reconstruction of
         a matching "real" pedigree.  Rather, it creates a skeleton
         pedindex sufficient for the correct operation of SOLAR.
         To do this, each individual in a kinship connected group
         (family) is made a sibling and listed together in the pedindex
         with an assigned family number.  This permits the correct one
         family at a time operation of SOLAR.

         This skeleton pedigree MUST always be used with the kinship
         matrix, phi2.gz, which is also generated by this command, which
         contains the original data reformatted for use with the pedindex.
         (The default mode of SOLAR in some cases is to use an on-the-fly
         generated kinship matrix determined by family relationships.
         However, when an empirical pedigree is loaded, this is suppressed
         as it is for discrete and evd models.)

         As with other pedigrees, once an epedigree is "loaded" in a
         working directory it remains loaded across multiple SOLAR sessions
         launched from that directory.

Return to command index

A1.37 empp


Purpose:  Calculate an empirical p-value from lodadj results

Usage:    empp <lod>

Return to command index

A1.38 epedigree


Purpose: Load an empirical pedigree

Usage    load epedigree <epedigree> [-t <threshold>] [-founders] [-1]
         load pedigree <epedigree> [-t <threshold>] [-founders] [-1] [-sex <sexvarfile>]

          Empirical pedigrees are csv files which are representations of
          a kinship matrix, having kinship values for pairs of individuals.
          Empirical pedigrees may be loaded either with the regular
          "pedigree" keyword or "epedigree".  However if "load pedigree"
          is used, and there is an error, the error message may be more
          helpful if "load epedigree" is used..  Empirical pedigree
          files must have fields named IDA, IDB, and KIN specified
          in the header line to be recognized as an empirical pedigree.

         -t <threshold> sets a minimum kinship value that indicates
         a family relationship.  Levels lower than this are not
         recorded in the phi2.gz matrix produced, and do not
         indicate a family relationship between such individuals.
         The default threshold is zero, so any non-zero kinship
         value in the empirical pedigree file indicates a
         relationship.  Given some noise in the kinship estimation,
         this means that families could become too large for
         efficient processing.  In that case, a threshold value can
         help.  A possible value would be 0.05.  This will
         potentially lose information, however, and is only ever
         needed for greater efficiency or capacity in later processing.

         -1 (one family) simply bypasses the normal family determination
         and regards all the individuals present as one family.  This
         can save a small amount of time for loading the pedigree, but
         if multiple families would otherwise be identified, it will
         make the phi2.gz matrix file larger and subsequent processing
         much slower.

         -founders (all founders) regards all input individuals as
         separate families having no relationship to anyone else.

         -sex <sexvarfile> permits the loading of a sex variable for
         this pedigree from the named file, which should be a CSV file
         having fields ID and SEX.  Other fields are ignored.  The
         sex variable coding is the same as for normal pedigree files:
         M, m, or 1 for males, and F, f, or 2 for females.  Missing
         values are not allowed.  All individuals in the pedigree must
         be included.

         This command does not attempt to generate a reconstruction of
         a matching "real" pedigree.  Rather, it creates a skeleton
         pedindex sufficient for the correct operation of SOLAR.
         To do this, each individual in a kinship connected group
         (family) is made a sibling and listed together in the pedindex
         with an assigned family number.  This permits the correct one
         family at a time operation of SOLAR.

         This skeleton pedigree MUST always be used with the kinship
         matrix, phi2.gz, which is also generated by this command, which
         contains the original data reformatted for use with the pedindex.
         (The default mode of SOLAR in some cases is to use an on-the-fly
         generated kinship matrix determined by family relationships.
         However, when an empirical pedigree is loaded, this is suppressed
         as it is for discrete and evd models.)

         As with other pedigrees, once an epedigree is "loaded" in a
         working directory it remains loaded across multiple SOLAR sessions
         launched from that directory.

Return to command index

A1.39 ephenotypes


Purpose:  Rewrite phenotypes file using IBDID's in pedindex.out

This has many uses, for example:
This way gaw10 example can be run as empirical pedigree based on phi2.gz
generated matrix as pedigree, and identical results obtained.  Other
pedigree/phenotypes files in original SOLAR formats could be processed
this way.  Also, a sex variable file can be made from the original pedigree
file.

Usage: ephenotypes <phenfile>

Example: (starting in solar:)
example
load ped gaw10.ped
ephenotypes gaw10.phen
mv ephenotypes.out gaw10.ephen
ephenotypes gaw10.ped
mv ephenotypes.out gaw10.sex

now make the empirical pedigree (in shell):
gunzip phi2.gz
echo "junk,ida,idb,kin" >phi2.csv
sed -E "s/[ ]+/,/g" <phi2 >phi2.sed
tail -n +2 phi2.sed >>phi2.csv
mv phi2.csv gaw10.eped

now use it (back in solar):
load ped gaw10.eped -sex gaw10.sex
load phen gaw10.ephen
model new
trait q4
covar age sex
polygenic -s

Return to command index

A1.40 epistasis


Purpose:  Use command:  multipoint -epistasis <N>

Usage:    multipoint -epistasis <N>  (<N> is the mibd index of interest)

The 'epistasis' command itself is reserved for future use.

Shortcuts: epista - epistasis

Return to command index

A1.41 evdinev


Purpose: I/O for EVD data

Usage:   trait ...
         covar ...

         evdout [<-evectors>[<-all>]] ;# write evddata.out,evectors if asked

         evdinx  [<evdfile>] ;# return X* matrix from evddata.out file
         evdiny  [<evdfile>] ;# return Y* matrix from evddata.out file
         evdinz  [<evdfile>] ;# return Z* matrix from evddata.out file

         evdinev [<-all>]  ;# load eigenvector matrix(es) from file(s) saved
                           ;# by evdout -evectors

Note: You must select trait, covariates, and anything else that would
      restrict the sample size before invoking evdout.  You
      do not need a maximized model, just trait and covariates.

evdout writes out evd transformed variables and the eigenvalues
      (which are called lambda) to a file named evddata.out in the
      maximization output directory but without actually doing a
      model maximization.  The trait value, which is not demeaned,
      is written to variable <traitname>_evd.  The covariate values,
      which are demeaned or scaled appropriately, are written to
      variables named evd2_<varname>_evd.  Ignore other fields.

      Additionally, you can write out the eigenvectors, either in
      normal per-pedigree mode (one matrix for each family) or in
      entire-sample mode if you select the -all option.  Filenames are
      evectors.family<n>.mat.csv for per-family matrices and
      evectors.mat.csv for entire sample (-all) matrix.  Matrix files
      are written to the outdir.

      Current methods rely on the EVD transformation of variables
      and do not require the Eigenvectors to be output, so generally
      speaking you should not use the -evectors option unless you
      know you need it.

evdinx returns the X* matrix corresponding to current evddata.out.
      The X* matrix has from left to right: (1) a column of 1's, (2+) one or
      more columns of EVD transformed covariates which have been scaled to
      mean of zero, in the order in which covariates occur in the model.
      evdinx has a -method2 option needed for the -method2 option of fphi.

evdiny returns the Y* matrix, which is a one vertical column matrix
      (aka vector) of EVD transformed trait values.

evdinz returns the Z matrix, which has has a column of 1's and a column
       of lambda's.

evdinev returns a list of eigenvector matrices, one for each family,
      or if the -all option is specified, just one matrix for the entire
      sample.  In order to use the -all option, you must have previously
      done evdout with the -evectors and -all options.

Example:

      foreach covar $testvars {
          model new
          trait q4
          covar $covar
          evdout
          set X [evdinx]
          set Y [evdiny]
          solve $X $Y
      }

Return to command index

A1.42 evdinx


solar::evdiny --
solar::evdinz --
solar::evdinev --

Purpose: I/O for EVD data

Usage:   trait ...
         covar ...

         evdout [<-evectors>[<-all>]] ;# write evddata.out,evectors if asked

         evdinx  [<evdfile>] ;# return X* matrix from evddata.out file
         evdiny  [<evdfile>] ;# return Y* matrix from evddata.out file
         evdinz  [<evdfile>] ;# return Z* matrix from evddata.out file

         evdinev [<-all>]  ;# load eigenvector matrix(es) from file(s) saved
                           ;# by evdout -evectors

Note: You must select trait, covariates, and anything else that would
      restrict the sample size before invoking evdout.  You
      do not need a maximized model, just trait and covariates.

evdout writes out evd transformed variables and the eigenvalues
      (which are called lambda) to a file named evddata.out in the
      maximization output directory but without actually doing a
      model maximization.  The trait value, which is not demeaned,
      is written to variable <traitname>_evd.  The covariate values,
      which are demeaned or scaled appropriately, are written to
      variables named evd2_<varname>_evd.  Ignore other fields.

      Additionally, you can write out the eigenvectors, either in
      normal per-pedigree mode (one matrix for each family) or in
      entire-sample mode if you select the -all option.  Filenames are
      evectors.family<n>.mat.csv for per-family matrices and
      evectors.mat.csv for entire sample (-all) matrix.  Matrix files
      are written to the outdir.

      Current methods rely on the EVD transformation of variables
      and do not require the Eigenvectors to be output, so generally
      speaking you should not use the -evectors option unless you
      know you need it.

evdinx returns the X* matrix corresponding to current evddata.out.
      The X* matrix has from left to right: (1) a column of 1's, (2+) one or
      more columns of EVD transformed covariates which have been scaled to
      mean of zero, in the order in which covariates occur in the model.
      evdinx has a -method2 option needed for the -method2 option of fphi.

evdiny returns the Y* matrix, which is a one vertical column matrix
      (aka vector) of EVD transformed trait values.

evdinz returns the Z matrix, which has has a column of 1's and a column
       of lambda's.

evdinev returns a list of eigenvector matrices, one for each family,
      or if the -all option is specified, just one matrix for the entire
      sample.  In order to use the -all option, you must have previously
      done evdout with the -evectors and -all options.

Example:

      foreach covar $testvars {
          model new
          trait q4
          covar $covar
          evdout
          set X [evdinx]
          set Y [evdiny]
          solve $X $Y
      }

Return to command index

A1.43 evdiny


solar::evdinz --
solar::evdinev --

Purpose: I/O for EVD data

Usage:   trait ...
         covar ...

         evdout [<-evectors>[<-all>]] ;# write evddata.out,evectors if asked

         evdinx  [<evdfile>] ;# return X* matrix from evddata.out file
         evdiny  [<evdfile>] ;# return Y* matrix from evddata.out file
         evdinz  [<evdfile>] ;# return Z* matrix from evddata.out file

         evdinev [<-all>]  ;# load eigenvector matrix(es) from file(s) saved
                           ;# by evdout -evectors

Note: You must select trait, covariates, and anything else that would
      restrict the sample size before invoking evdout.  You
      do not need a maximized model, just trait and covariates.

evdout writes out evd transformed variables and the eigenvalues
      (which are called lambda) to a file named evddata.out in the
      maximization output directory but without actually doing a
      model maximization.  The trait value, which is not demeaned,
      is written to variable <traitname>_evd.  The covariate values,
      which are demeaned or scaled appropriately, are written to
      variables named evd2_<varname>_evd.  Ignore other fields.

      Additionally, you can write out the eigenvectors, either in
      normal per-pedigree mode (one matrix for each family) or in
      entire-sample mode if you select the -all option.  Filenames are
      evectors.family<n>.mat.csv for per-family matrices and
      evectors.mat.csv for entire sample (-all) matrix.  Matrix files
      are written to the outdir.

      Current methods rely on the EVD transformation of variables
      and do not require the Eigenvectors to be output, so generally
      speaking you should not use the -evectors option unless you
      know you need it.

evdinx returns the X* matrix corresponding to current evddata.out.
      The X* matrix has from left to right: (1) a column of 1's, (2+) one or
      more columns of EVD transformed covariates which have been scaled to
      mean of zero, in the order in which covariates occur in the model.
      evdinx has a -method2 option needed for the -method2 option of fphi.

evdiny returns the Y* matrix, which is a one vertical column matrix
      (aka vector) of EVD transformed trait values.

evdinz returns the Z matrix, which has has a column of 1's and a column
       of lambda's.

evdinev returns a list of eigenvector matrices, one for each family,
      or if the -all option is specified, just one matrix for the entire
      sample.  In order to use the -all option, you must have previously
      done evdout with the -evectors and -all options.

Example:

      foreach covar $testvars {
          model new
          trait q4
          covar $covar
          evdout
          set X [evdinx]
          set Y [evdiny]
          solve $X $Y
      }

Return to command index

A1.44 evdinz


solar::evdinev --

Purpose: I/O for EVD data

Usage:   trait ...
         covar ...

         evdout [<-evectors>[<-all>]] ;# write evddata.out,evectors if asked

         evdinx  [<evdfile>] ;# return X* matrix from evddata.out file
         evdiny  [<evdfile>] ;# return Y* matrix from evddata.out file
         evdinz  [<evdfile>] ;# return Z* matrix from evddata.out file

         evdinev [<-all>]  ;# load eigenvector matrix(es) from file(s) saved
                           ;# by evdout -evectors

Note: You must select trait, covariates, and anything else that would
      restrict the sample size before invoking evdout.  You
      do not need a maximized model, just trait and covariates.

evdout writes out evd transformed variables and the eigenvalues
      (which are called lambda) to a file named evddata.out in the
      maximization output directory but without actually doing a
      model maximization.  The trait value, which is not demeaned,
      is written to variable <traitname>_evd.  The covariate values,
      which are demeaned or scaled appropriately, are written to
      variables named evd2_<varname>_evd.  Ignore other fields.

      Additionally, you can write out the eigenvectors, either in
      normal per-pedigree mode (one matrix for each family) or in
      entire-sample mode if you select the -all option.  Filenames are
      evectors.family<n>.mat.csv for per-family matrices and
      evectors.mat.csv for entire sample (-all) matrix.  Matrix files
      are written to the outdir.

      Current methods rely on the EVD transformation of variables
      and do not require the Eigenvectors to be output, so generally
      speaking you should not use the -evectors option unless you
      know you need it.

evdinx returns the X* matrix corresponding to current evddata.out.
      The X* matrix has from left to right: (1) a column of 1's, (2+) one or
      more columns of EVD transformed covariates which have been scaled to
      mean of zero, in the order in which covariates occur in the model.
      evdinx has a -method2 option needed for the -method2 option of fphi.

evdiny returns the Y* matrix, which is a one vertical column matrix
      (aka vector) of EVD transformed trait values.

evdinz returns the Z matrix, which has has a column of 1's and a column
       of lambda's.

evdinev returns a list of eigenvector matrices, one for each family,
      or if the -all option is specified, just one matrix for the entire
      sample.  In order to use the -all option, you must have previously
      done evdout with the -evectors and -all options.

Example:

      foreach covar $testvars {
          model new
          trait q4
          covar $covar
          evdout
          set X [evdinx]
          set Y [evdiny]
          solve $X $Y
      }

Return to command index

A1.45 evdmat


solar::evdinx --
solar::evdiny --
solar::evdinz --
solar::evdinev --

Purpose: I/O for EVD data

Usage:   trait ...
         covar ...

         evdout [<-evectors>[<-all>]] ;# write evddata.out,evectors if asked

         evdinx  [<evdfile>] ;# return X* matrix from evddata.out file
         evdiny  [<evdfile>] ;# return Y* matrix from evddata.out file
         evdinz  [<evdfile>] ;# return Z* matrix from evddata.out file

         evdinev [<-all>]  ;# load eigenvector matrix(es) from file(s) saved
                           ;# by evdout -evectors

Note: You must select trait, covariates, and anything else that would
      restrict the sample size before invoking evdout.  You
      do not need a maximized model, just trait and covariates.

evdout writes out evd transformed variables and the eigenvalues
      (which are called lambda) to a file named evddata.out in the
      maximization output directory but without actually doing a
      model maximization.  The trait value, which is not demeaned,
      is written to variable <traitname>_evd.  The covariate values,
      which are demeaned or scaled appropriately, are written to
      variables named evd2_<varname>_evd.  Ignore other fields.

      Additionally, you can write out the eigenvectors, either in
      normal per-pedigree mode (one matrix for each family) or in
      entire-sample mode if you select the -all option.  Filenames are
      evectors.family<n>.mat.csv for per-family matrices and
      evectors.mat.csv for entire sample (-all) matrix.  Matrix files
      are written to the outdir.

      Current methods rely on the EVD transformation of variables
      and do not require the Eigenvectors to be output, so generally
      speaking you should not use the -evectors option unless you
      know you need it.

evdinx returns the X* matrix corresponding to current evddata.out.
      The X* matrix has from left to right: (1) a column of 1's, (2+) one or
      more columns of EVD transformed covariates which have been scaled to
      mean of zero, in the order in which covariates occur in the model.
      evdinx has a -method2 option needed for the -method2 option of fphi.

evdiny returns the Y* matrix, which is a one vertical column matrix
      (aka vector) of EVD transformed trait values.

evdinz returns the Z matrix, which has has a column of 1's and a column
       of lambda's.

evdinev returns a list of eigenvector matrices, one for each family,
      or if the -all option is specified, just one matrix for the entire
      sample.  In order to use the -all option, you must have previously
      done evdout with the -evectors and -all options.

Example:

      foreach covar $testvars {
          model new
          trait q4
          covar $covar
          evdout
          set X [evdinx]
          set Y [evdiny]
          solve $X $Y
      }

Return to command index

A1.46 evdout


solar::evdmat --
solar::evdinx --
solar::evdiny --
solar::evdinz --
solar::evdinev --

Purpose: I/O for EVD data

Usage:   trait ...
         covar ...

         evdout [<-evectors>[<-all>]] ;# write evddata.out,evectors if asked

         evdinx  [<evdfile>] ;# return X* matrix from evddata.out file
         evdiny  [<evdfile>] ;# return Y* matrix from evddata.out file
         evdinz  [<evdfile>] ;# return Z* matrix from evddata.out file

         evdinev [<-all>]  ;# load eigenvector matrix(es) from file(s) saved
                           ;# by evdout -evectors

Note: You must select trait, covariates, and anything else that would
      restrict the sample size before invoking evdout.  You
      do not need a maximized model, just trait and covariates.

evdout writes out evd transformed variables and the eigenvalues
      (which are called lambda) to a file named evddata.out in the
      maximization output directory but without actually doing a
      model maximization.  The trait value, which is not demeaned,
      is written to variable <traitname>_evd.  The covariate values,
      which are demeaned or scaled appropriately, are written to
      variables named evd2_<varname>_evd.  Ignore other fields.

      Additionally, you can write out the eigenvectors, either in
      normal per-pedigree mode (one matrix for each family) or in
      entire-sample mode if you select the -all option.  Filenames are
      evectors.family<n>.mat.csv for per-family matrices and
      evectors.mat.csv for entire sample (-all) matrix.  Matrix files
      are written to the outdir.

      Current methods rely on the EVD transformation of variables
      and do not require the Eigenvectors to be output, so generally
      speaking you should not use the -evectors option unless you
      know you need it.

evdinx returns the X* matrix corresponding to current evddata.out.
      The X* matrix has from left to right: (1) a column of 1's, (2+) one or
      more columns of EVD transformed covariates which have been scaled to
      mean of zero, in the order in which covariates occur in the model.
      evdinx has a -method2 option needed for the -method2 option of fphi.

evdiny returns the Y* matrix, which is a one vertical column matrix
      (aka vector) of EVD transformed trait values.

evdinz returns the Z matrix, which has has a column of 1's and a column
       of lambda's.

evdinev returns a list of eigenvector matrices, one for each family,
      or if the -all option is specified, just one matrix for the entire
      sample.  In order to use the -all option, you must have previously
      done evdout with the -evectors and -all options.

Example:

      foreach covar $testvars {
          model new
          trait q4
          covar $covar
          evdout
          set X [evdinx]
          set Y [evdiny]
          solve $X $Y
      }

Return to command index

A1.47 example


Purpose:  Copy the SOLAR example to the current working directory

Usage:    example

Notes:    The example may be used in conjunction with the SOLAR tutorial
          in Chapter 3.  To read that, give the command "doc -chapter 3"

          The example files are actually located in the doc/Example
          subdirectory of the SOLAR installation.  To find the "doc"
          subdirectory, give the command "doc -whereis"

Shortcuts: examp - example

Return to command index

A1.48 exclude


Purpose:  Excude phenotypes from use as covariates by automodel
          and allcovar commands.

Usage:    exclude <var> <var> ... ; Add variable(s) to exclude
          exclude                 ; List all excluded variables
          exclude -reset          ; Reset to default exclude list
          exclude -clear          ; Remove all variables from list

Notes: You may add to the exclude list with one or more exclude commands.

       By default, all variables named and/or mapped by the FIELD command
       will be excluded (except for SEX).  The exclude command lets you
       exclude additional variables.  (The FIELD command variables are
       pedigree variables such as ID which would never be wanted as
       covariates.)

       The default exclude list will include the following standard PEDSYS
       pedigree mnemonics: 

         seq fseq mseq sseq dseq ego id fa mo sire dam pedno famno twin
         mztwin ibdid fibdid mibdid blank kid1 psib msib fsib birth exit

       If you are excluding more variables that you are keeping, you might
       consider simply specifying the covariates you want explicitly
       rather than using the allcovar or automodel commands, or creating
       a new phenotypes file with fewer fields.

       The variable name you enter will be converted to lower case.  Solar
       is intended to handle phenotypic and pedigree variables in a case
       insensitive manner.

Shortcuts: excl - exclude

Return to command index

A1.49 factorial


Purpose:  Compute factorial

Usage:    factorial N

Example:  set big [factorial 10]

Notes:    A double precision value is returned, since double precision can
          represent larger numbers exactly than integers.

          Non-integral N is rounded to the nearest integer first, then
            the factorial is computed for that integer.

          For large enough N, the value returned might not be exact.
             (Currently this happens for N > 18.)

          Negative N (after rounding) raises a range error

          This may be, but need not be, used in an "expr".

Return to command index

A1.50 fakedata


Purpose: Generate fake pedigree/phenotypes file for testing memory required

Usage:   fakedata <n_individuals> <n_traits>

fakedata.out is written with 4-person families. Trait data is random.

Note: The intended use is testing whether memory is sufficient to handle
a particular number of traits/individuals/pedigree-size.

Return to command index

A1.51 faketraits


Purpose: Generate fake pedigree/phenotypes file for testing memory required

Usage:   faketraits <n_individuals> <n_traits>

faketraits.out is written with 4-person families. Trait data is random.

Note: The intended use is testing whether memory is sufficient to handle
a particular number of traits/individuals/pedigree-size.

Return to command index

A1.52 fformat


Purpose:  Replace Tcl format with fixed width fields for numbers

Usage:    fformat <spec>+ <value1>+

          <spec>   format specifier(s) as for Tcl format command.
                   f, e, or g format required for "fixed width"
                   operation, like this:

                  %[--][W][.P]T  where T is e, f, g, or y
                     default right justification
                  -  specifies left justification
                  --  specifies center justification
                  W is desired width
                  P is desired precision (before and after decimal)
                  T is format type:
                    f is floating decimal
                    e is exponential
                    g is floating decimal if suitable, then exponential
                    y same as g, except that exponential format is not
                      used until the output would otherwise be 0.0 or
                      nearly so for a non-zero value.  At least one
                      significant digit is preserved for P 1-4, two 
                      digits for P 4-6, and three digits for P  7-*.
                      This is more consistent with readability, 
                      retaining the fixed format nearly as long as
                      possible.  Sometimes, more space will be used than W,
                      but this is much less likely than with the standard
                      G format.  However, unlike F format, the the result
                      will not go to zero unless it is zero.  When possible,
                      allow more space in "width" than the precision seems
                      to require.  That way, under special circumstances,
                      there is extra space for signs, "e", decimal point,
                      etc.
                    z same as y, except resulting string is trimmed to
                    minimum space for csv files

                   Note: For fractional numbers, make width at least 2
                   than precision, to allow for leading "0."  Then allow
                   one more for - sign, if that is possible.

         This is intended as a drop-in replacement for the Tcl "format"
         command, modifying "minimum width" to "fixed width" for
         the f, e, and g formats ("fixed width" makes for more
         readable columns) and adding a center justification option.

Return to command index

A1.53 field


Purpose:  Allow non-standard user data field names

Usage:    field                              ; This shows all mappings
          field <default name> <user name>   ; Create one mapping 
          field <default name>               ; Show one mapping
          field <default name> -none         ; Ignore this field (see notes)
          field <default name> -default      ; Restore full default

Examples: field ID Subject
          field FA Father
          field MO Mother

Notes:

The default names are ID, FA, MO, PROBND, MZTWIN, FAMID, SEX, and HHID.
EGO, SIRE, and DAM are also permitted in place of ID, FA, MO by default.
However, unlike the default, you can only specify one name to be
mapped.  However, you can restore the full default for any field
using the -default argument.

The -none argument declares a field to be ignored whether it exists or
not.  This is useful, for example, if you want the optional PROBND
field to be ignored:

    field PROBND -none

This would signify that there is no PROBND field, i.e. there are no
probands.  PROBND, MZTWIN, and HHID are optional fields for which the
-none argument may be used.  Most other fields are mandatory and -none
cannot be used for them.

Your field selections are saved for future SOLAR sessions to a file
named field.info in the working directory.  Once you have entered
your field selections, you need not enter them again (starting with
version 2.0.6) when you are running SOLAR from the same working
directory.  However, if you followed our previous recommendation to
put field commands in a .solar file, note that the settings in the
.solar file take precedence over the settings in field.info.

If you would like to remove an entire old set of field assignments,
you can delete the field.info file BEFORE starting SOLAR.  (SOLAR
reads the field.info file when starting.)

FAMID field may or may not be required depending on whether your ID's
are unique in the entire dataset.  If your ID's are unique, you do
not need FAMID.  However, if your ID's are sequential within each
family, you need a FAMID field in both your pedigree and phenotypes
files, otherwise they are ambiguous.  SOLAR now catches this mistake
when either pedigree or phenotypes files are loaded.


Shortcuts: fie - fields

Return to command index

A1.54 file-freq


Purpose:  Describe frequency data file requirements

The freq file contains allele frequency data for a set of marker loci,
one line per marker.  Each line consists of the following space-delimited
fields:

    marker name, all_1 name, all_1 freq, all_2 name, all_2 freq, ...

The allele frequencies for a marker must sum to 1 (a small roundoff error
is tolerated.)

Allele frequency information is used when IBDs are computed for a marker
that is not completely typed, i.e. there are individuals for whom genotype
data is not available.

Example:

D20S101 123 0.2457 127 0.1648 133 0.5895
IGF1 A 0.4 B 0.3 C 0.1 F 0.2
ApoE E1 .125 E2 .25 E3 .625

Once a freq file has been loaded, it is not necessary to load it again
in subsequent SOLAR runs from the same working directory.

Shortcuts: file-f - file-freq

Return to command index

A1.55 file-map


Purpose:  Describe map data file requirements

The map file contains chromosomal locations for a set of marker loci
on a single chromosome.  Typically, marker locations are given in cM
and a mapping function is used to convert inter-marker distances to
recombination fractions.  Currently, the Kosambi and Haldane mapping
functions are allowed.  Marker locations can also be specified in
basepairs.  While cM locations can be floating point numbers, basepair
locations must be integers; non-integer locations are truncated to
integers.  When basepair locations are used, the mapping function is
called "basepair" rather than Kosambi or Haldane, but in fact there
is no mapping provided from basepairs to recombination fractions and
such maps cannot be used to compute multipoint IBDs.  The first line
of the map file contains the chromosome number, and (optionally) the
name of the mapping function.  If no mapping function is specified,
the mapping is assumed to be Kosambi.  The chromosome number can be
any character string not containing a blank or a forward slash (/),
although the use of integers is recommended.  For example, the strings
'01' and '10q' are allowed.  Each line after the first line consists
of the following space-delimited fields:

    marker name, marker location

Examples:

20
D20S101         0.0
D20S202        34.2
D20S303        57.5

TCF basepair
2448b   19828941
380659  19829489


Shortcuts: file-map - file-map

Return to command index

A1.56 file-marker


Purpose:  Describe marker data file requirements

The marker file contains genotype data for one or more marker loci.
The file consists of one record for each individual who has been typed
for one or more of these markers.  Each record must contain the following
fields:

    ego ID, genotype1, genotype2, ...

In addition, a family ID field must be included when ego IDs are not
unique across the entire data set.  If, however, each ego ID is unique
to an individual and an individual may appear multiple times in the
data set, then the family ID should not be included.  The same genotypic
data is then associated with every occurrence of an individual.

The default field names are ID and FAMID.  EGO is also accepted by
default.  You can set up SOLAR to use different field names by using
the field command (see 'help field').  You do not necessarily need to
change your names to match ours.

Fields with names other than ID and FAMID are assumed to contain marker
data, with the exception of the following names: FA, MO, SEX, MZTWIN,
HHID, AGE, PEDNO, and GEN. Fields having one of these names are ignored.

The scheme used to encode genotypes may vary from field to field.
SOLAR recognizes many standard coding schemes, but the safest way to
code genotypes is with the forward slash to separate the alleles.

Ex: AB
    E1 E3
    123/456

A blank genotype field denotes missing data, as do the genotypes 0/0
and -/-.  SOLAR requires that either both alleles are typed or both
alleles are missing, except for male genotypes at X-linked marker loci.
In that case, either a single allele is specified (the other allele is
blank, 0, or -), or the genotype is coded as a "homozygote".

Ex: 237/243   valid female X-linked marker genotype
       /251   valid male X-linked marker genotype
      251/0   valid male X-linked marker genotype
      -/251   valid male X-linked marker genotype
    251/251   valid male X-linked marker genotype

The marker loci in the marker file must all be autosomal or all be
X-linked.  By default, SOLAR assumes that the markers are autosomal.
If the markers are X-linked, then either the XLinked option must be
set with the ibdoption command prior to loading the marker file, or
the -xlinked option must be given in the load marker command.

Once a marker file has been loaded, it is not necessary to load it
again in subsequent SOLAR runs from the same working directory.

Shortcuts: file-mar - file-markers

Return to command index

A1.57 file-matrix


Purpose:  Describe csv matrix file format requirements

CSV matrix files were introduced in SOLAR version 7.5.0, and it is
conditionally recommended that all users who are creating their own matrix
files use this format, as it is more easily understood and created than the
previous format.  Matrix writers should beware that there are many possible
pitfalls in hand writing a matrix file, and matrix files should at minimum
be checked with the "matrix debug" command after loading the first time.

The original space column delimited matrix file is still created and used by
SOLAR itself and documented in the manual, and if you merely wish to modify
an existing matrix file, it may still be easiest to use the original format.
The format of the phi2.gz kinship matrix file is described in Section 8.3
of the manual, and the same rules would apply to any original format matrix
file.

The csv matrix file is an ordinary comma separated variable file with the
first line being a header which names all the fields used, as is common
with CSV files.  However SOLAR requires that all matrix files, including
csv matrix files, be compressed using the gzip program.  Thus all matrix
files will have the final filename extension ".gz".

All csv matrix files must have the following three required fields:
id1, id2, matrix1.  id1 and id2 are user assigned ID's, as used in pedigree
and phenotypes files.  matrix1 is typically the primary matrix in the file,
for example the primary matrix in the phi2.gz file is the "phi2" matrix.  A
csv matrix file may also have a second matrix named matrix2.  If FAMID's are
required to disambiguate ID's in your dataset, you must also include
famid1 and famid2, the famid's corresponding to id1 and id2.  SOLAR will
determine whether famid's are needed or not from the pedigree file.  If
famid's are not needed, famid1 and famid2 will be ignored if they are
present.  If famid's are needed and not present in the matrix file, the
load matrix command will generate an error and the matrix will not be loaded.

Note that is the "load matrix" command which actually assigns meaningful
names to matrixes for use in the SOLAR omega.  For example, the phi2
matrix is usually loaded with the following command:

  load matrix phi2.gz phi2

Alternatively, an analysis examining dominance would require a second
matrix, delta7:

  load matrix phi2.gz phi2 delta7

In both cases the first matrix (called matrix1 in a CSV matrix file) will
be associated with the name phi2 in the omega.  The second command will
also associate the second matrix (matrix2) with the name delta7.  Neither
command shown let you directly use the names matrix1 and matrix2 in the
omega, unless those names were also specified in the load matrix command.
The CSV naming scheme is intended to allow the "load matrix" command to be
used exactly as it was before.

Variable names other than id1, id2, matrix1, matrix2, famid1, and famid2
in a CSV matrix file will be (in this version) ignored.

There are other semantic requirements for matrix files, depending
on the type of matrix involved.  One is that every individual in an analysis
must be included in the matrix file regardless of whether they have pairwise
relationships with others in the file.  At minimum every individual has a
self relationship of value 1.0.  With version 7.5.0 it is required that you
include these "diagonal" matrix values of 1.0 for every individual in your
sample, and possibly everyone in your pedigree file.  Otherwise, diagonal
values will default to -1 which could have bad consequences.  (Note: in
version 7.5.0 it is not possible to check this with matrix debug because
it only checks values in the input file, not defaulted values in the
matrix itself.)

It is not permitted to have individuals in the CSV matrix file who are not
defined in the pedigree file.

Historically matrix files were dependent on the pedindex.out files created
when the pedigree was loaded.  This is because the very IBDID's used in
original format matrix files might be assigned to different actual ID's
if the pedigree is changed.  CSV format matrix files are less dependent,
but it is still likely that changes to a pedigree file would require
corresponding changes in the matrix file.  And sometimes this may be
overlooked, causing disasterous results.  SOLAR has long prepended a
pedigree checksum record to all the matrix files it creates which are checked
against the pedigree file when the matrix is loaded.  Changes to the
pedigree after the matrix file was created will cause an error to
be raised when attempting to load that matrix file.

The same checksum checking features can optionally be used in CSV matrix
files.  Once the matrix has been created and compressed using gzip, the
procedure "matcrc" can be run on them, for example:

solar> matcrc phi2.csv.gz

This will determine a checksum value from the currently loaded pedigree
file, and prepend this to the matrix file in a record with id1=checksum
and id2=checksum.  It also will gunzip the matrix file to perform this
change, and gzip after the change has been made.

If the names assigned to matrixes (in the load matrix command) begin with
"ibd" or "mibd" a special defaulting rule applies.  -1 values found in these
matrixes mean that the actual value should be taken from the phi2 matrix
(for the first matrix) or delta7 (for the second matrix).  Furthermore, if
-1 is READ FROM THE FILE for a diagonal matrix entry, the default is at
that time applied to every pair including that individual.  (On the other
hand, if a -1 occurs on the diagonal only because the diagonal entry was
missing from the file, defaulting would occur for the missing diagonal but
not for every other pair including that individual, the default value for
those other pairs would be zero.)  This feature had historical importance
but is considered obsolescent now and is not recommended for use in new
matrix files.
 

Return to command index

A1.58 file-pedigree


Purpose:  Describe pedigree data file requirements

The pedigree file consists of one record for each individual in the data
set.  Each record must include the following fields:

    ego ID, father ID, mother ID, sex

In addition, a family ID is required when ego IDs are not unique across
the entire data set.  If the data set contains genetically identical
individuals, an MZ-twin ID must be present (as described below).  If an
analysis of household effects is planned, a household ID can be included
(also described below).

The default field names are ID, FA, MO, SEX, FAMID, MZTWIN, and HHID.
EGO, SIRE, and DAM are also accepted by default.  You can set up SOLAR to
use different field names by using the field command (see 'help field').
You do not necessarily need to change your names to match ours.

A blank parental ID or a parental ID of 0 (zero) signifies a missing
parent.  SOLAR requires that either both parents are unknown, i.e. the
individual is a founder, or both parents are known.

If the pedigree data consists of unrelated individuals with no parental
data, then the father ID and mother ID fields are not required. If there
are parents for whom pedigree file records do not exist, then records
are created internally for those parents, who are assumed to be founders.

Sex may be encoded as M, m, or 1 for males and F, f, or 2 for females.
The missing value for sex is 0, U, u, or blank.

The MZ-twin ID is used to designate genetically identical individuals,
e.g. monozygotic twins or triplets.  Each member of a group of identical
individuals should be assigned the same MZ-twin ID.  Twin IDs must be
unique across the entire data set.  If there are no genetically identical
individuals in the data set, this field need not be present in the
pedigree file.

The household ID, if present, will be used to generate a matrix file
(house.gz) that can be used later to include a variance component for
household effects.  Household IDs must be unique across the entire data
set.

The family ID field is required only when ego IDs are not unique across
the entire data set.  For example, if a data set consists of nuclear
families, and the same ego ID may appear in more than one family, then
the family ID must be included.  Or if, for example, IDs are sequential
integers unique only within pedigrees, then the pedigree ID must be
included.

At the time the pedigree file is loaded, SOLAR indexes the data set.
This indexing is internal and should not be confused with any external
indexing the user may have imposed upon the data set.  This indexing
information is stored in a file named 'pedindex.out' in the directory
where SOLAR is running when the pedigree data is loaded.  Be careful
about deleting files unless you are sure they are not needed by SOLAR!  

Once a pedigree file has been loaded, it is not necessary to load
it again in subsequent SOLAR runs from the same working directory.

Shortcuts: file-pe - file-pedigrees

Return to command index

A1.59 file-phenotypes


Purpose:  Describe phenotypes data file requirements

The phenotypes file may be in either PEDSYS or Comma Delimited format.

The phenotypes file consists of one record for each individual.
Each record must include an ego ID and one or more phenotypic
values (which may be blank to signify missing data).

   ego ID, phen 1, phen 2, ...

(The phenotypes file may also contain other data, such as pedigree
data.  You could use one file as both your phenotype and your
pedigree file, though that is not necessarily recommended.  There
are fewer possible problems with separate files.)

Just as with the pedigree file, a field name FAMID is required when
IDs are not unique across the entire data set.  (If your ego IDs
are unique, it is probably better _not_ to include a family ID,
as it just complicates things slightly.)

If your data has probands and you wish to employ ascertainment
correction, the phenotypes file must have a proband field.  In this
field, blank ( ) or zero (0) signifies non-proband, and anything
else signifies proband.  A decimal point is _not_ permitted after
the zero.  The presence of a proband field automatically turns on
ascertainment correction.

The default field names are ID, FAMID, and PROBND.  You can set up
SOLAR to use different field names by using the field command.

The phenotype field names may be anything within certain rules.
(no spaces, tabs, or slashes; also certain special characters such
as *#,^/-+ can cause problems in the names of phenotypes used
as covariates).  If you stick with alphabetic characters, numeric
characters, and underscores you will be safe.

The phenotype data fields must be numbers, either with or without
decimal points.  Zero (0) is always considered a permissible value;
blank ( ) or null (e.g. no value in between the commas ",," in a
comma delimited file) must be used to signify missing values.

Floating or fixed point numbers must always include a decimal
point; numbers without a decimal point are assumed to be integers.
Binary, discrete or categorical values should be indicated with
consecutive integers (e.g. 0,1 or 1,2 or 2,3).  SOLAR checks all
phenotype fields to see if they contain only two consecutive
integers and judges them "binary" if they do.  Binary traits are
automatically handled by the SOLAR discrete trait liability
threshold modeling code; you don't need to do anything special.
See Chapter 9 for discussion on what to do with "categorical"
data that has more than two categories.

Without special scripting, categorical phenotypes with more than two
categories should not be used in SOLAR.  (SOLAR will not identify
categorical phenotypes with more than two categories and instead
treat them as quantitative phenotypes.)

The 'load phenotypes' command creates a file named phenotypes.info
in the working directory.  Once a phenotypes file has been loaded,
it need not be loaded again in the same working directory, unless
you change the file itself.

SOLAR automatically removes pedigrees in which no non-proband has
all required phenotypic data from the analysis.  You need not
remove these pedigrees yourself.  You will get a full accounting of
pedigrees and individuals included and excluded in the maximization
output files (described below) , by running the 'maximize' command,
or giving the 'verbosity max' command prior to other commands.


Shortcuts: file-ph - file-phenotypes

Return to command index

A1.60 finemap


Purpose:  Set fine mapping threshold for multipoint

Usage:    finemap <LOD> [<LOD> ...]
          finemap default
          finemap off
          finemap                  {displays current finemap setting}

Example:  finemap 0.588

Notes:    After each multipoint pass when the interval is greater than 1
          SOLAR will examine all points in the regions around points
          higher than some threshold.  This threshold is set with the
          finemap command.

          The default is 0.588.

          Finemapping can also be turned off.  The finemap setting is
          unimportant when the interval is 1.  (Note: versions of SOLAR
          prior to 1.1.0 did finemapping only around the single highest
          peak by default.)
          

Shortcuts: fine - finemap

Return to command index

A1.61 fix


Purpose:  Constrain a parameter to its current value

Usage:    fix <name>    ; Name is the name of the parameter you want to fix

Example:  fix h2r

Shortcuts: fix - fix

Return to command index

A1.62 fphi


Purpose: Fast test and heritability approximation 

Usage: fphi [optional -fast  -debug -list <file containing trait names>
       -precision <h2 decimal count> -mask <name of nifti template volume>
        -evd_data <base filename of EVD data] 

  -fast Performs a quick estimation run 
  -debug Displays values at each iteration 
  -list performs fast fphi on a list of trait (does not include covariate data)
  -precision number of decimals to calculate h2r
  -mask outputs fphi -fast results of the list of voxels from -list option  
  -evd_data When using the -list option the EVD data option can be used to avoid
  having to calculate EVD data within the command      
  
 Fast permutation and heritability inference (FPHI). FPHI is based on the 
eigenvalue decomposition on the kinship matrix and
a search through values of h2r for accurate approximation of heritability values
and statistical inference. The default setting is the full search out to 9 decimal places
of h2r. The "-fast" option uses the Wald approximation. Both use log likelyhood p-value
estimation.  The default setting should provide very accurate h2 estimates 
that are nearly identical to the standard maximum likelihood inference.
The h2 values obtained using Wald approximation are usually within 3% of 
the classical MLE values.  The same functionality is available for GPU 
computing. Use gpu_fphi for heritability calculations in very large datasets.
For details see Ganjgahi et al., "Fast and powerful heritability inference 
for family-based neuroimaging studies”. 

Return to command index

A1.63 freq


Purpose:  Process the allele frequency data.

Usage:    load freq [-nosave] <filename>    ; loads freq file
          freq unload                ; unloads allele frequencies
          freq mle [-nose] [-hwe] [<marker> ...]
                                     ; computes MLE allele frequencies
          freq save <filename>       ; saves allele frequencies to a file
          freq show [<marker> ...]   ; displays allele frequencies

          The 'freq unload' command will not unload allele frequency
          information for markers with currently loaded genotype data.
          Frequency data for such markers is unloaded at the time the
          genotype data is unloaded with the 'marker unload' command.

          In general, it is not necessary to unload existing frequency
          data before loading a new freq file; the unloading will be done
          automatically.  However, the 'freq load' command will not replace
          previously loaded frequency data if MLE allele frequencies have
          been computed but not saved for one or more markers.  In that
          case, the MLEs must be saved to a file, or the -nosave option
          must be specified.

          MLE allele frequencies are computed only for markers with
          currently loaded genotype data.  To load genotype data, use
          the 'load marker' command.  If a marker name is not specified
          for the 'freq mle' command, MLE allele frequencies will be
          computed for all markers with currently loaded genotype data.
          By default, standard errors are computed for the marker allele
          frequency estimates.  These additional calculations result in
          increased compute time, but can be avoided with the '-nose'
          option if standard errors are not required.

          When the '-hwe' option is given, the 'freq mle' command carries
          out an additional likelihood maximization for each marker, in
          this case finding MLEs for marker genotype frequencies rather
          than allele frequencies.  A test of whether the assumption of
          Hardy-Weinberg equilibrium holds is provided by comparing the
          likelihood of the genotype frequency-based model with that of
          the allele frequency-based model (which assumes HWE).  When
          this test has been conducted, the associated p-values will be
          displayed by the 'marker show' command.
          
          The file created by the 'freq save' command is written in a
          format suitable for subsequent loading with the 'freq load'
          command.

          If a marker name is not specified for the 'freq show' command,
          all currently loaded frequency information will be displayed.

          The currently loaded allele frequency information is stored in
          the file 'freq.info' in the current working directory.  This
          file persists between SOLAR runs, which means that the allele
          frequencies which are in effect at the end of a session will
          still be in effect the next time SOLAR is invoked (from within
          the same working directory.)

Notes:    The set of markers in the freq file and the set of markers in
          the marker file do not have to be the same.  Allele frequencies
          will be computed for markers that do not appear in the freq file
          at the time these markers are loaded with the 'load marker'
          command.

For a description of the freq file, enter 'file-freq'


Shortcuts: fre - freq

Return to command index

A1.64 full_filename


Purpose:  Prepend the maximization output directory name to filename(s)

Usage:    full_filename [<filename>]+

Note:     See "help outdir".  full_filename is intended for scripts.

Return to command index

A1.65 gen_corr


Purpose: Calculates the genetic correlation between two traits.

Usage: gen_corr [options --pvalues --debug --evd_data <basename of EVD file names>]

  All options are not required.
  --pvalues gives the pvalues of rhog and rhoe
  --debug display all output from each iteration from each numerical differentiation delta
  --evd_data <basename of EVD file names> allows the user to enter the base filename of the
   output of create_evd_data so that prolonged eigenvalue decompositions across data sets
   with similiar ID sets don't need to be repeated.
  
  Example
  trait trait_one trait_two
  gen_corr --debug    

Return to command index

A1.66 gpu_fphi


Purpose: Fast test and heritability approximation performed on the GPU

Usage: gpu_fphi -list <trait list file name> -o <output file name>
                [optional -all_gpus -use_covs -thread_size <number of threads used by gpu kernels>
		--gpus <list of gpu ids separated by commas to be used in analysis>
		--display-gpus --verbose --batch_size <number of traits to calculate at once per stream>
		--n_streams <number of gpu streams per gpu> --evd_data <base filename of output of create_evd_data>
		--nifti <nifti template filename must have extension .nii.gz>]
Returns: Outputs H2r, standard error of H2r and p-value of H2r to <output file name> or default output filename. 

Requirements:
 	-Dynamically linked version of solar-eclipse
      -CUDA capable NVIDIA graphics card with a minimum architecture of 3.5
      -Linux version of solar-eclipse
-CUDA drivers must be from toolkit 9.0 or later (Not applicable if built locally from source code)
      They can be downloaded https://developer.nvidia.com/cuda-10.2-download-archive

    -list  File containing all the traits that gpu_fphi will process
   -thread_size Number of GPU threads used by GPU kernels must be 32,64,128,256,512, or 1024
   -o <output filename or base output filename if -nifti option is used> default name is gpu-gwas.out
   -all_gpus  Option to select all usable CUDA Capable GPUs
 -gpus Allows for the selection of certain GPU from the start using their GPU IDs
   -batch_size Number of trait to computed at once per stream
   -n_streams Number of streams to be used per GPU
 -evd_data Uses precalculated EVD data from create_evd_data command
 -nifti Uses a template nifti file to output H2r, standard error, and p-values into image volumes with
 dimensions and orientation taken from specified template file
 -display_gpus Displays all usable GPUs that an be used in analysis
 -use_covs Includes covariates that were selected with covariate command

Notes: The trait list file name should be setup as a file in which all the traits are listed 
and separated by spaces or newlines.  If you are experiencing difficulty in running
this program please try the following:
  -Ensure your GPU/GPUs are NVIDIA CUDA GPUs that have a device architecture greater than equal
	to 3.5.
  -If you wish to run more than one GPU at once be sure your driver's compute mode is set to DEFAULT or 0.  
  This can be checked using the command nvidia-smi -c.
  -Make sure the CUDA drivers installed are from at least toolkit 9.0
  -Try closing all processes currently running on the GPU to free up memory and device occupancy.  

Return to command index

A1.67 gpu_gwas


Purpose: Calculates pvalues on an nvidia gpu for snps given a loaded phenotype containg columns
labeled "snp_" or a plink file that contains ids that correspond to the phenotype
and pedigree ids.  

Warning: With this version of SOLAR gpu_gwas is still in experimental stages.  The stability of
the command is the issue, not the results of the command. 

Usage: gpu_gwas -plink <plink base filename> 
[optional:  -all  -list <filename containing list of traits>
        --evd_data <base filename from create_evd_data output> --precision <precision of h2r must be integer [1,6]
      --stream_count <number of stream per gpu> --batch_size <number of snps processed by a stream at once>
      --size <number of gpu threads must be 32,64,128,256,512,or 1024> --verbose
        --gpus <list of gpu ids separated by no space commas> --screen 
      --calibrate <percentage of SNPs to calibrate with> --o <base output filename>
        --blank_ped ] 
 
Requirements:
 	-Dynamically linked version of solar-eclipse
      -CUDA capable NVIDIA graphics card with a minimum architecture of 3.5
      -Linux version of solar-eclipse
-CUDA drivers must be from toolkit 9.0 or later (Not applicable if built locally from source code)
      They can be downloaded https://developer.nvidia.com/cuda-10.2-download-archive

  Command Arguments:
  
  -plink <plink base filename> Base filename for plink .bed,.bim., and .fam.  This variable is required

  -o <filename> Uses this filename as opposed to the default output fileaname

  -all  Selects all availible usable NVIDIA GPUs 

  -evd_data <base filename from create_evd_data> Allows gpu_gwas to avoid it's own EVD calculation 

  -precision <integer in range [1,6]> Determines how accurate the value of h2 in null and 
   model calculations Default: 5

  -stream_count <integer number in range [1,6]> Number of GPU streams per GPU. Each GPU has a number
   of independant streams that read file data, transfer it to the GPU, perform calculations, and 
   finally transfer the results back to the CPU Default: 2

  -batch_size <integer number in range [1,10000]> Number of SNPs calculated per iteration of a 
   GPU stream.  It's possible a number within this range could be too large, which will lead
   to an error or crash.  Calibrate it relative to the number of subjects. If not value is
   specified then this command to estimate the largest number given memory constraints

  -size <integer number from set {32,64,128,256,512,1024}> Number of GPU threads used in GPU 
   kernels.  Different values cause different runtimes.  By default this command will try 
   to estimate the best thread size for the data set

  -verbose Option that displays progress

  -gpus <GPU integer IDs separate by commas> List of NVIDIA GPU IDs that the command will use
                                    
  -screen Option to run screen mode, which estimates p-values quickly without having to converge 
  
  -blank_ped Calculates p-values for SNP data without using a pedigree, must be used with -screen 
  
  -calibrate <decimal in range (0,1)> Compares the runtimes of different possible options using 
   a fraction of the full data set, which is the argument of the option

	The GWAS function performs an examination of a genome-wide set of genetic variants and calculate p-values for 
the association between trait and individual SNPs. The output is saved as a csv file labeled either 
<trait name>-gpu-gwas.out or <trait name>-gpu-gwas-screen.out unless a name specified with --o is given. 
Covariates can be factored in by running sporadic_normalize on the trait beforehand.  Missing SNP values are automatically
set two times the frequency of the major allele.  The -screen option runs a fast version of the algorithm that only computes 
rough estimate of the p-value and beta.  -evd_data option is highly recommended for data sets with a large number of subjects.
Note if precision is set to 6 depending on the gpu there may be convergence issues.  By default precision is set to 5.  
You can test for optimal thread count, thread count can be set with --size option, by running --calibrate.  
Traits are selected with either the trait command or --list option if you require to run gpu-gwas on a list of traits. 
-stream_count is recommended to be set to 2 and --batch_size is recommended to be set to 500 or 1000 depending 
on the number of subjects.      

Example:

covar age*sex
sporadic_normalize -header voxel_list.header -out voxels.csv
load phenotype voxels.csv
gpu_gwas -list voxel_list.header  -plink snp_file --precision 4 --size 256 --batch_size 500 --stream_count 2

Return to command index

A1.68 gpu_pedifromsnps


Purpose: Calculates empirical pedigree matrix from a set of PLINK files using NVIDIA GPU hardware 

Usage: gpu_pedifromsnps -i <filename> -o <filename> -freq <filename>
       [optional: -corr <decimal> -gpus <list of GPU ids> -all -normalize
                  -batch_size <integer number> -thread_size <integer number> -snp_stride <integer number>
                  -per-chromo -id_list <filename> ]            
Requirements:
 	-Dynamically linked version of solar-eclipse
      -CUDA capable NVIDIA graphics card with a minimum architecture of 3.5
      -Linux version of solar-eclipse
-CUDA drivers must be from toolkit 9.0 or later (Not applicable if built locally from source code)
      They can be downloaded https://developer.nvidia.com/cuda-10.2-download-archive

  Command Arguments:
  
  -i <plink base filename> Base filename for plink .bed,.bim., and .fam.  Required

  -o <filename> Output filename of the matrix calculated. Required

  -freq <filename from plink_freq command> Filename of the output calculated 
   by the plink_freq command. Required

  -all  Selects all availible usable NVIDIA GPUs  

  -snp_stride <integer in range [1,10]> Number of SNPs calculated perform GPU 
   kernel block

  -id_list <Name of file containing subject IDs> Output will only contain subjects with ids
   included in the specified file

  -per-chromo Creates a separate matrix for each chromosome found within the plink file

  -batch_size <integer number in range [1,20000]> Number of SNPs calculated per iteration of a 
   GPU stream.  It's possible a number within this range could be too large, which will lead
   to an error or crash.  Calibrate it relative to the number of subjects. If not value is
   specified then this command to estimate the largest number given memory constraints

  -thread_size <integer multiple of 32 no greater than 1024> Number of GPU threads used in GPU 
   kernels. 

  -corr <decimal value> Used to determine the exponent of the variance. Default value is
   -1. Other values are discouraged

  -gpus <GPU integer IDs separate by commas> List of NVIDIA GPU IDs that the command will use
                                    
  -normalize Option to normalize the final result such that the diagonal contains only 1's 
  

	gpu_pedifromsnps is a GPU optimized version of the pedifromsnps command.  
  This command only allows for the method one algorithm that's described in help pedifromsnps.      

Example:

covar age*sex
sporadic_normalize -header voxel_list.header -out voxels.csv
load phenotype voxels.csv
gpu_gwas -list voxel_list.header  -plink snp_file --precision 4 --size 256 --batch_size 500 --stream_count 2

Return to command index

A1.69 grid


Purpose:  Find the highest likelihood in the vicinity of marker(s)

Usage:    grid <marker1> [<marker2> ...]

Example:  grid APOa D6S2436

Notes:    outdir (or trait) and ibddir must previously be specified.
          ibd matrices for each marker must already have been computed.

          A model named "null0.mod" is expected in the output directory.
          That can be created with the polygenic command.

          Summary output is displayed on your terminal and written to a file
          named grid.out.  An output and model file are saved for each
          marker with the name <markername>_best_theta.

          The twopoint command also has a "-grid" option which will grid
          around every marker evaluated.

          A special "-bsearch" option sets point at which a "golden section"
          search begins.  By default, grid single-steps by 0.01 from 0.00 
          to 0.05 and then begins a golden section search.  (This is on the
          assumption that the peak will be reached before 0.05 in the vast
          majority of cases.)  If you have a significant number of cases
          above 0.05, you might want to change this, for example:

               grid mrk -bsearch 0.01

          would start the golden section search after 0.01 (which will be
          faster if the value is expected to be greater than 0.05, but
          slower if the value is going to be less than 0.05).  Note: 0.01 is
          the smallest value at which the search can begin.  On the other
          hand if you wanted to single-step all the way to 0.10, you would
          give the command:

               grid mrk -bsearch .1


Shortcuts: grid - grid

Return to command index

A1.70 gridh2r


purpose: grid around the h2r value in polygenic model

usage:   gridh2r [-lower <lower>] [-upper <upper>] [-step <step>]

         -lower <lower>  Lowest h2r; default is current value less 0.1
         -upper <upper>  Highest h2r; default is current value plus 0.1
         -step <step>    step; default is 0.01

Notes:   polygenic should be run first.  Only univariate models with
         only e2,h2r parameters are supported.

         Out is written to a file named gridh2r.out in the maximization
         output directory.  The starting model is listed first regardless
         of whether it is in the range specified.
     
         After completion, the model with the best loglikelihood will
         be loaded, but with the h2r constraint (if any) deleted.  This
         might be the starting model even if it isn't in the specified
         range.

         Each point requires a maximization, so they might come out
         slowly.  For full maximization detail, give the "verbosity plus"
         or "verbosity max" command beforehand.

Return to command index

A1.71 gsd2h2q


Purpose:  Convert esd,gsd,[qsd1] parameters to standard parameters

Usage:                         ;# trait only required for multivariate model
          gsd2h2r [<trait>]    ;# compute h2r from esd,gsd,[qsd1]
          gsd2sd  [<trait>]    ;# compute SD from esd,gsd,[qsd1]
          gsd2h2q [<trait>]    ;# compute h2q1 from esd,gsd,[qsd1]


Note:     Use polygsd command to set up model, and maximize to maximize it
          first, followed by linkgsd for linkage models.

          See the documentation for the polygsd, linkgsd.

Return to command index

A1.72 gsd2h2r


solar::gsd2sd  --
solar::gsd2h2q --

Purpose:  Convert esd,gsd,[qsd1] parameters to standard parameters

Usage:                         ;# trait only required for multivariate model
          gsd2h2r [<trait>]    ;# compute h2r from esd,gsd,[qsd1]
          gsd2sd  [<trait>]    ;# compute SD from esd,gsd,[qsd1]
          gsd2h2q [<trait>]    ;# compute h2q1 from esd,gsd,[qsd1]


Note:     Use polygsd command to set up model, and maximize to maximize it
          first, followed by linkgsd for linkage models.

          See the documentation for the polygsd, linkgsd.

Return to command index

A1.73 gsd2sd


solar::gsd2h2q --

Purpose:  Convert esd,gsd,[qsd1] parameters to standard parameters

Usage:                         ;# trait only required for multivariate model
          gsd2h2r [<trait>]    ;# compute h2r from esd,gsd,[qsd1]
          gsd2sd  [<trait>]    ;# compute SD from esd,gsd,[qsd1]
          gsd2h2q [<trait>]    ;# compute h2q1 from esd,gsd,[qsd1]


Note:     Use polygsd command to set up model, and maximize to maximize it
          first, followed by linkgsd for linkage models.

          See the documentation for the polygsd, linkgsd.

Return to command index

A1.74 gwas


Purpose: Calculates pvalues for snps given a loaded phenotype containing columns
labeled "snp_" or a plink file that contains ids that correspond to the phenotype
and pedigree ids.


Usage: gwas [optional: -plink <plink base file name>  -fix  -list <trait list filename>
		 -np <number of permuations> -precision <h2 decimal count> 
		 -evd_data <base filename output of create_evd_data>
		 -use_covs -batch_size <number of SNPs to computed at once>
		 -screen ]

For single mode

gwas --single-snp <snp name> --list <trait list filename> --precision <h2r decimal count>
 
	The GWAS function performs an examination of a genome-wide set of genetic variants and calculate p-values for 
the association between trait and individual SNPs. The output is saved as gwas.out in the directory of the 
trait.  
Covariates can be factored in by running sporadic_normalize on the trait beforehand.  
 Using a plink file to load the snps into the function is highly recommended for large data sets since the 
file format reads into memory significantly quicker than csv files.  This file type requires a .bed, .fam,
and .bim file to work.  This command's default method performs a full model estimate to compute p-values,
SNP beta values, standard error of beta, h2r for SNP model, and standard deviation.  The permutation 
option performs a quick permutation method to estimate p-values. The fix option will impute 
missing SNP values with a mean so null and EVD data can be reused for all SNP data instead of recalculating it.
Fix is not applicable for permutation method because permutation by default imputes missing SNP values.

In order to use the -list option the SNP data must be loaded from a plink file set and cannot be used
with the permutation method.  

-precision setting changes the precision of h2 from default 6 to the specified value.
-batch_size sets the number of SNPs to be read and computed at the same time.
-use_covs allows for covariates to be included in -fix analysis, covariates are selected through covariate command
-evd_data reads EVD data created through create_evd_data, can only be used with -fix option
-screen runs screen mode which quickly computes an estimate of a beta, beta standard error, and a p-value

 Single SNP mode calculates the GWAS on a list of trait for a single SNP within the loaded phenotype.

 Output is written to gwas.out in the trait's directory, except when -list is used.  In that case
 output is written to the main directory as *-gwas.out.

 If you plan to run gwas with the fix option -fix and have a data set with a very large number of subjects consider
 using create_evd_data prior to running gwas. 

Return to command index

A1.75 h2power


Purpose:  Perform heritability power calculations

Usage:    h2power [-prev] [-grid {<from> <to> <incr>}] [-data <fieldname>]
                  [-nreps <nreps>] [-seed <seed>] [-overwrite] [-plot]
                  [-nosmooth]

          h2power -restart [-grid {<from> <to> <incr>}] [-nreps <nreps>]
                  [-plot] [-nosmooth]


          This command performs a power calculation for the currently
          loaded pedigree, with the following default assumptions:

              (1) the trait to be studied is either quantitative or
                  dichotomous (e.g. affected/unaffected)

              (2) the trait to be studied is influenced by additive
                  genetics

              (3) all pedigree members will be phenotyped for the trait
                  to be studied (unless the -data option is used to
                  exclude those individuals who will not have phenotypic
                  data; see the description of this option below)

          Simulation is used to estimate the frequency with which one
          would expect to obtain a significantly non-zero estimate of
          heritability given that a specified fraction of the phenotypic
          variance is due to additive genetics.  Twice the difference in
          the loglikelihoods of the polygenic and sporadic models is
          asymptotically distributed as a 1/2:1/2 mixture of a chi-square
          random variate with one degree of freedom and a point mass at 0.
          A result is considered significant if the probability of
          obtaining the observed chi-square value, in the absence of a
          genetic effect, is less than or equal to .05.

          The default is to perform 10 replicates of the simulation for
          each heritability in the range .01, .02, .03, ..., .99.  For
          each replicate, a polygenic model is fitted to the simulated
          data, and the resulting heritability estimate and chi-square
          statistic are recorded.  The observed chi-squares are converted
          to power, i.e. the power to detect the corresponding observed
          heritability at a significance level of .05.

          The following options give the user some control over the power
          calculation procedure:

              -prev     If the trait to be studied is dichotomous, SOLAR
                        will assume the existence of an unobserved liability
                        distribution. Individuals with liabilities above
                        some threshold value will be "affected", i.e. they
                        will have the larger of the two trait values (for
                        example, a 1 for a 0/1 trait.) The -prev option
                        is used to specify the "disease" prevalence, or
                        fraction of individuals who are "affected", which
                        in turn determines the liability threshold.

              -grid     Specify the set of heritabilities for which power
                        will be computed. At each grid point, trait data
                        having that expected heritability are simulated,
                        sporadic and polygenic models are fitted to the
                        data, and the loglikelihoods of the models are
                        compared. The observed chi-square test statistics
                        are averaged to obtain the expected chi-square
                        value for that heritability. The grid is given by
                        a set of three numbers enclosed in curly braces:

                            {<from> <to> <incr>}

                        where <from> is the starting heritability, <to>
                        is the last heritability considered, and <incr>
                        is the interval between grid points.  If the
                        desired grid consists of a single effect size,
                        the three-number list can be replaced by that
                        single number and curly braces are not required.

              -data     Exclude individuals from the power calculation
                        who are missing data for phenotype <fieldname>.

              -nreps    Perform <nreps> simulations at each grid point.
                        The default number of replicates is 100.

              -seed     Set the random number generator seed.  The default
                        is to set the seed based on the date and time.

              -plot     At the end of the power calculations, display a
                        plot of power versus QTL heritability.  To display
                        this plot for a previously completed calculation,
                        use the command "plot -h2power".

              -nosmooth By default, the power curve is smoothed by fitting
                        a line through the observed chi-square values as
                        a function of the heritability squared prior to
                        converting the chi-square values to power.  This
                        option turns the smoothing off.

              -overwrite (or -ov)  Overwrite the results of a previous
                                   power calculation.

              -restart (or -r)     Restart a power calculation.


Notes:    It is possible to change the grid of heritabilities and the number
          of replicates when restarting a calculation.  The calculation
          will not be restarted if a grid is chosen that does not include
          all the points in the previously specified grid unless the
          -overwrite option is included, in which case the simulation
          replicates for any extra grid points are discarded.  Similarly,
          the -overwrite option is necessary if fewer replicates are
          requested than were done previously, in which case any extra
          replicates are discarded.

          The following files are created:

              h2power.out   A space-delimited file containing a line for
                            each grid point in the format X Y, which is
                            suitable for input to plotting packages such
                            as xmgr.  The first (or X) column contains the
                            heritability.  The second (or Y) column contains
                            the power.

              h2power.info  Stores the various options selected along with
                            the chi-square statistic, averaged over the
                            replicates, at each grid point.

              h2power.chis  Stores the results of the simulation replicates
                            run at each grid point.  This file, along with
                            h2power.info, is used to restart an interrupted
                            power calculation.

           During a power calculation, various files named "simqtl.*" are
           created along with a trait directory named "simqt". These will
           be removed at the end of the run.

Return to command index

A1.76 help


Purpose:  Display instructional messages

Usage:    help                                ; Displays list of commands
          help <command>                      ; Displays doc for one command
          help -output <filename>             ; Write list to a file
          help <command> -output <filename>   ; Write doc to a file
          help -user                          ; Display user commands

Notes:    See also the doc and usage commands.

          The help for any particular command is quite detailed and may
          cover several pages.  The usage command provides a very brief
          summary of command options, and does not invoke the 'more' pager
          so it will stay on your window while you enter the next command.

          See help for "helpadd" to see how you can add your own help
          messages to the help system.  To display all the user commands,
          give the command "help -user".  To display the help for any
          particular user command, the ordinary "help <command>" will
          work, because it searches user tcl files if <command> is not
          found in SOLAR tcl files.


Shortcuts: hel - help

Return to command index

A1.77 helpadd


Purpose:  Explain how to add more help messages to SOLAR

          SOLAR first looks for a help message for a particular
          commands in the main SOLAR tcl package file solar.tcl.
          Next, it looks for help messages in all *.tcl files in the
          current tcl search paths:

              .            (The current working directory)
              ~/lib        (A lib subdirectory of your home directory)
              $SOLAR_LIB   (the lib subdirectory of the SOLAR installation.
                            This symbol is created by SOLAR.)

          All help messages are commented text blocks beginning with a
          command header and ending with a line containing "#-" or any
          line with no leading "#".

              # <packagename>::<commandname> [--] [private|public]

          The packagename is taken from the filename: it is the segment
          of the filename preceeding the first period (.).  For example,
          the packagename for solar.tcl is "solar" and the packagename
          for john.solar.tcl is "john".

          Each help block should contain at minimum a purpose and usage
          section as shown in example below.

          The help display of all commands available is taken from all
          the commands defined in solar.tcl and other files in $SOLAR_LIB.
          The "help -user" command shows all commands for which help is
          available defined in . and ~/lib.  To block display of this
          command in "help -user" include the word "private" following
          the command name.

          Example:

          File named multiproc.tcl defines a command named mmultipoint.  The
          file is placed in ~/lib and has a help header that looks like this:

          # multiproc::mmultipoint -- public
          #
          # Purpose:   Spawn multipoint jobs per chromosome
          #
          # Usage:     mmultipoint <minlod>
          #-
          
          mmultipoint uses a private procedure mspawn:

          # multiproc::mspawn -- private
          #
          # Purpose:   Launch job on cluster machine
          #
          # Usage:     mspawn <scriptfile>
          #-


Shortcuts: helpa - helpadd

Return to command index

A1.78 hlod


Purpose:  Heterogeneity test for linkage

Usage:    hlod [-step <stepsize>]

          <stepsize> is size of increment in h2q1 and h2r.  Default is
            0.1, 0.05, 0.02, or 0.01 chosen to be about 1/10 of total
            heritability in starting model (with a minimum of
            8 test points plus maximized h2q1 and total heretability).

Notes:    Linkage test model must be loaded at beginning.  Corresponding
          null model will be found in current output directory.

          Complete results are written to hlod.out in maximization output
          directory.

          Linkage model may only have 1 linkage element; null model
          is "polygenic" (null0) model in current output directory.

          H0 vs. H1 test is only considered significant if p < 0.0001
          (LOD 3).  If this is not significant, there will be a warning
          that the H1 vs H2 test makes no sense.

          hlod uses "homo" program written by Harald Goring for which
          documentation is provided in Appendix 5 of the documentation
          (use "doc" command to browse, click on Table of Contents).


Return to command index

A1.79 house


Purpose:  Enable analysis of household effects

Usage:    house            ; enable analysis of household effects
          house -suspend   ; temporarily suspend household effects
                           ; (the c2 parameter is constrained to zero)
          house -delete    ; delete household effects (c2 is deleted)

Examples:

 ** new model ** Note you must give house command after trait or model new

        solar> model new
        solar> trait weight
        solar> covar age^1,2#sex
        solar> house                       ; activates h. effects
        solar> polygenic -screen           ; polygenic analysis w. screening

** old model ** 

        solar> load model poly
        solar> house
        solar> polygenic

Notes:  This may be used for any common environmental effects (not
        necessarily "household" effects).  The house command changes the
        current model as follows:

            1) A parameter c2 (c stands for "common") is created
            2) The house matrix is loaded
            3) A c2*house term is added to the omega
            4) The c2 parameter is added to the e2 + ... = 1 constraint
            5) The starting value of c2 is carved away from the value of
               e2 so that the constraint remains satisfied

        The pedigree file must contain a HHID field.  If so, the
        'load pedigree' command produces a matrix named house.gz.
        That matrix will be used.  If house.gz is not present, this
        command will fail, although you can map HHID to any particular
        field in your pedigree file using the "field" command.

        WARNING!  If you load a pedigree without a HHID field (or a field
        mapped to it with the field command) a pre-existing house.gz, now
        assumed to be obsolete, will be deleted.  This is to prevent you
        from making the mistake of using an obsolete house.gz.

        HHID can be a number or an alphanumeric name (with no internal
        spaces or tabs) but the number 0 (zero) has a special meaning.
        Zero indicates singleton households--each individual with HHID
        of zero is a separate household, not associated with other
        individuals having HHID of zero.  Blank or null has the same
        effect as zero.

        The 'house' command should be specified after such commands as
        automodel, trait, polymod, or spormod (which initialize polygenic
        or sporadic models) and/or just before commands which maximize
        models such as 'polygenic,' 'maximize,' or 'multipoint.'  This
        is because "polygenic" or "sporadic" models, by definition,
        do not have household effects.  But the polygenic command will
        do the household "analysis" if it detects the presence of a c2
        parameter which is not constrained to zero.

        We define the following model classes:

        sporadic  (covariates only...and e2)
          household (covariates and c2)
          polygenic (covariates and h2r)
            household polygenic (covariates, c2, and h2r)

        To create a pure "household" model with no active genetic component,
        give the commands "spormod" and "house" in that order after setting
        up the trait(s) and covariate(s).

        By default, if a household element is in the model, pedigrees will
        be merged whenever individuals in separate pedigrees share the same
        household.  The resulting groups are called "pedigree-household"
        groups.  This may significantly increase memory requirements.
        Pedigree merging is controlled by two SOLAR options (see the
        option command).  The default of 1 for MergeHousePeds means that
        pedigree merging, as just described, will be done.  This feature
        may be changed by setting MergeHousePeds to zero prior to the
        polygenic or maximize commands:

        solar> option mergehousepeds 0
        solar> polygenic -screen

        The MergeAllPeds option combines all the pedigrees into one large
        group if set to 1.  This is an alternative simpler method of
        merging, but it may increase memory requirements even more.
        


Shortcuts: hou - houses

Return to command index

A1.80 howmanyranch


Purpose: Show how many ranch machines a user is using

Usage:  howmanyranch <userid>

Notes:  See also whoranch.  Requires use of Grid Engine software.

Return to command index

A1.81 ibd


Purpose:  Compute marker-specific IBDs.

Usage:    ibd [-nomle] <marker> [<marker> ...]
                                    ; computes IBDs for specified markers
          ibd [-nomle]              ; computes IBDs for all markers
          ibd mito                  ; computes mitochondrial IBDs

          ibd export [-file <filename>] [-overwrite] [-append]
                     [-nod7] [-ibdid] [<marker> ...]
                                    ; writes IBDs for specified markers
                                    ; to a file in comma delimited format

          ibd import [-file <filename>] [-nod7] [-ibdid] [<marker> ...]
                                    ; reads IBDs for specified markers
                                    ; from a file in comma delimited format

          ibd prep <program> [-version 2.82] [<marker> ...]
                                    ; prepares input files needed to compute
                                    ; IBDs using <program>, where <program>
                                    ; is loki, simwalk (sw), merlin, or
                                    ; genehunter (gh)

          Before any ibd command can be run, the directory in which to
          store the IBDs must be specified with the 'ibddir' command.  This
          specification is stored in a file ibddir.info in the working
          directory, so it need not be repeated in future sessions from
          the same working directory.

          The first record in all matrix files produced by SOLAR, including
          IBD matrix files, is a checksum and not real data; see matcrc
          command for details.  This checksum is optional in user created
          matrix files. If present, it prevents a using matrix with a 
          different or changed pedigree.
          
          In the absence of prior knowledge of marker allele frequencies,
          it is recommended that the 'freq mle' command be used to compute
          maximum likelihood estimates of the allele frequencies.  This
          will improve the accuracy with which missing genotypes are imputed
          in the IBD computation process.  IBDs will not be computed for
          markers with simple-count allele frequencies generated by the
          'load marker' command, i.e. MLE allele frequencies are required
          when prior frequency data is not available.  To compute IBDs
          using the simple-count allele frequencies instead of MLEs,
          specify the -nomle option.  Alternatively, the NoMLE option can
          be set using the 'ibdoption' command.

          The method used to compute marker-specific IBDs will depend on
          the family structure and will be selected automatically.  It
          is possible to choose the Monte Carlo method regardless of the
          automatic selection by using the 'ibdoption' command.  For
          performance reasons, the Monte Carlo method will be used
          automatically for completely-typed markers.

          Mitochondrial IBDs are a special case.  Each pair of individuals
          who share a common maternal lineage, i.e. who have inherited the
          same mitochondrial DNA, will be assigned an IBD value of 1, while
          all other pairs are assigned an IBD value of 0.  The necessary
          information is completely contained in the pedigree data.  Hence,
          there is no mitochondrial marker data to load, nor are allele
          frequencies required.

          The 'ibd export' command outputs the IBDs for a specified set
          of markers into a comma delimited file.  The IBDs must be stored
          in the directory named in the 'ibddir' command.  If no marker
          names are given, then all IBDs found in the 'ibddir' directory
          are exported.  By default, the SOLAR indexed IDs (IBDIDs) in the
          IBD files are translated to permanent IDs, and family IDs are
          included when present in the pedigree file.  The default name for
          the output file is "solar-ibd-export.out".  The default fields in
          the output file are MARKER, [FAMID,] ID1, ID2, IBD, and D7.

          The options for the 'ibd export' command are

              -file (or -f)         Export IBDs to this filename.

              -overwrite (or -ov)   Overwrite existing output file.

              -append (or -a)       Append IBDs to existing output file.

              -nod7                 Don't include D7 field from IBD files.

              -ibdid                Write out SOLAR indexed IDs (IBDIDs)
                                      rather than permanent IDs.

          The 'ibd import' command inputs the IBDs for a specified set of
          markers from a comma delimited file.  IBD files are written and
          stored in the directory named in the 'ibddir' command.  If an
          IBD file for an imported marker already exists, it is overwritten.
          By default, the permanent IDs in the input file are translated
          to SOLAR indexed IDs (IBDIDs).  Family IDs must be included in
          the input file when they are present in the pedigree file.
          The default name for the input file is "solar-ibd-import.in".
          The default fields in the input file are MARKER, [FAMID,] ID1,
          ID2, and IBD.  If the input file does not contain a D7 field,
          all D7 values in the IBD files are set to zero.  By default,
          all IBDs in the input file are imported.  However, if markers
          are specified on the command line, then IBDs are imported for
          those markers only.  If one and only one marker is specified on
          the command line, a MARKER field is not required in the input
          file.  The order of the markers in the input file is unimportant,
          but all the lines for a given marker must be adjacent.  Unless
          there is inbreeding in the pedigree file, checks are made to
          ensure that imported IBDs for parent-offspring and self pairs
          are correct (0.5 and 1, respectively).  An option is provided
          to make the parent-offspring error checking appropriate for
          X-linked markers.  Checks are also made to ensure that imported
          IBDs for unrelated individuals are equal to 0.

          The options for the 'ibd import' command are

              -file (or -f)         Import IBDs from this filename.

              -nod7                 Don't take D7 from input file; set D7
                                      to zero instead.

              -ibdid                Input file contains SOLAR indexed IDs
                                      (IBDIDs) rather than permanent IDs.

              -xlinked              Use error checking appropriate for
                                      X-linked markers

Notes:    The computed IBDs are stored in gzipped files with names of the
          form 'ibd.<marker>.gz', where <marker> is the marker name.  All
          working files created during the IBD computation process will be
          stored in the marker-specific subdirectories created by the
          'marker load' command.

          Mitochondrial IBDs are stored in the gzipped file 'ibd.mito.gz'.
          If a marker exists in the marker data that is named 'mito', the
          same file name will be used to store the IBDs for that marker.
          Hence, the marker name 'mito' should not be used if you intend
          to use the 'ibd mito' command.


Shortcuts: ibd - ibd

Return to command index

A1.82 ibddir


Purpose:  Set directory in which IBD matrix files are stored
           (twopoint only; use mibddir to set up multipoint)

Usage:    ibddir <dirname>     ; set director for IBD files
          ibddir               ; show current ibddir
          ibddir -session      ; show ibddir entered in this session

Notes:    The ibddir selected is saved in file ibddir.info for
          future SOLAR sessions.  Once a midddir is selected, it
          need not be selected again within the same working directory,
          EXCEPT for the purposes of writing out ibd files.  To
          prevent accidentally overwriting pre-existing ibd files,
          it is necessary to explicitly enter the ibddir
          command before using the ibd command or other commands
          which write files into the ibddir.

Shortcuts: ibdd - ibddir

Return to command index

A1.83 ibdoption


Purpose:  Set or display IBD processing options.

Options:  XLinked   select this option to load X-linked marker data
          NoMLE     if this option is chosen, MLE allele frequencies are not
                    required for IBD calculation
          MCarlo    if this option is chosen, the Monte Carlo method will be
                    used to calculate IBDs
          MibdWin   size (in cM) of the multipoint IBD window - the MIBDs at
                    a given chromosome location depend only on markers inside
                    or on the boundary of the window centered at that location

Usage:    ibdoption                   ; displays current IBD options

          ibdoption xlinked           ; toggles the XLinked option
          ibdoption xlinked <y/n>     ; sets the XLinked option
          ibdoption xlinked ?         ; displays the current setting of XLinked

          ibdoption nomle             ; toggles the NoMLE option
          ibdoption nomle <y/n>       ; sets the NoMLE option
          ibdoption nomle ?           ; displays the current setting of NoMLE

          ibdoption mcarlo            ; toggles the MCarlo option
          ibdoption mcarlo <y/n>      ; sets the MCarlo option
          ibdoption mcarlo ?          ; displays the current setting of MCarlo
          ibdoption mcarlo # <num>    ; sets number of imputations
          ibdoption mcarlo #          ; displays number of imputations
          ibdoption mcarlo max <y/n>  ; choose max risk for first imputation?
          ibdoption mcarlo max ?      ; displays max risk option

          ibdoption mibdwin           ; displays the multipoint IBD window size
          ibdoption mibdwin <size>    ; sets the multipoint IBD window size


Shortcuts: ibdo - ibdoptions

Return to command index

A1.84 ibs


Purpose:  Compute marker-specific IBS matrices.

Usage:    ibs <marker> [<marker> ...]
                                    ; computes IBSs for specified markers
          ibs                       ; computes IBSs for all markers

          Before the ibs command can be run, the directory in which to
          store the IBSs must be specified with the 'ibddir' command.

Notes:    The computed IBSs are stored in gzipped files with names of the
          form 'ibs.<marker>.gz', where <marker> is the marker name.


Shortcuts: ibs - ibs

Return to command index

A1.85 if_global_exists


Purpose:  Check if a Tcl global variable exists

Usage:    if_global_exists <global_name>

          Returns 1 if global exists, 0 otherwise.

Notes:    This is used in scripts in an "if" statement.  For example:

          if {[if_global_exists SPECIAL_CASE]} {
              global SPECIAL_CASE
              puts "This is case $SPECIAL_CASE"
          }

          You do not need to declare the variable "global" before
          calling if_global_exists.  However, you will need to declare it
          global before setting or using it in a script.  Note that all
          variables declared at the interpreter level (at the solar>
          prompt) are automatically global.  Global variables should
          not be confused with "shell" variables such as SOLAR_BIN
          (though, all shell variables may be found in the global
          array "env", for example, $env(SOLAR_BIN)).

          Global variables are a convenient way of passing variables
          through many levels of procedure calls without rewriting all
          the intervening procedures, or across commands on an ad hoc basis.
          Use of global variables is considered "bad style" by programming
          purists and other bores.  But if they're so smart, why aren't they
          writing your program?  It is true, however, that use of global
          variables can sometimes introduce bugs and other unintended
          consequences.

          Globals variables prefixed with SOLAR_ are reserved for use by
          the standard SOLAR procedures defined in solar.tcl.  But solar.tcl
          might also use unprefixed globals, so it is recommended that users
          use their own unique prefix to be safe.

See Also: remove_global

Return to command index

A1.86 if_parameter_exists


Purpose:  Check if a parameter exists without creating it

Usage:    if_parameter_exists <parameter_name>

          Returns 1 if parameter exists, 0 otherwise.

Notes:    This is used in scripts in a "if" statement.  For example:

          if {[if_parameter_exists h2q1]} {
              constraint e2 + h2r + h2q1 = 0
          }

Return to command index

A1.87 inormal


Purpose:  Save inverse normal transformation to a file (see also define)

IMPORTANT: To create a model using an inverse normal transformation,
           it is more convenient to use the "define" command, and NOT
           the inormal command.  The "inormal" command itself is for
           those rare situations where you need to save the inverse
           normal transformation to a file for some other purpose.

Usage:    define <defname> = inormal_<phenotype>
          trait <defname>

          inormal -trait <trait> [-file <filename>] -out <filename>
                  [-phenfile] [-nofamid] [-class <class>]

          -class <class> only include ID's when their class variable equals this value

          (See notes below for obscure forms of the inormal command not
           recommended for most users.)

Notes:    For the "define" command, the <defname> can be any name you
          can make up.  The inormal_ prefix may be abbreviated down to inor_ .
          The <phenotype> is any phenotypic variable in the currently
          loaded phenotypes file.

          For the "inormal" command itself, you must use one of the
          arguments "-phenfile" or "-file <filename>".  The "-phenfile"
          argument is a shorthand way of specifying the currently loaded
          phenotypes file.  The "-file <filename>" argument is used to
          specify any file.  In either case, the file must be in the form
          of a phenotypes file, with fields for <trait> and ID (and FAMID
          if required to make the ID's unique).  BE SURE TO SPECIFY THE
          "-out" ARGUMENT FOR THE OUTPUT FILE.

          The inverse normal transformation of a dataset is performed
          by the following procedure:

            The trait values are sorted, and for any value V found
            at position I in the sorted list, a quantile is computed
            for it by the formula I/(N+1).  The inverse normal
            cumulative density function (see "normal") is computed for
            each quantile and stored in an array keyed by ID, and 
            FAMID if applicable.  When the value V occurs multiple times,
            the inverse normal is computed for each applicable quantile,
            averaged, then the average is what is stored for each ID.
            These values are accessed when the ID is provided.  The
            array for each trait is deleted by the -free option.
          
          See also the "normal" command, which computes normal distribution
          functions.  inormal uses a "normal -inverse" command.

          OBSCURE FORMS OF THE INORMAL COMMAND

          Largely for internal purposes, such as the implementation of
          the define command, there are additional obscure forms of the
          inormal command which save the inverse normal results in
          a tcl variable for access on an individual ID basis:

          inormal -trait <trait> [-file <filename>] -tclvar
                  [-phenfile] [-nofamid] 
          inormal -trait <trait> -id <id> [-famid <famid>]
          inormal -free <trait>
          inormal -reset

          The first form above is like the standard form, except that the
          -out argument is replaced with a -tclvar argument, signifying
          that the inverse normal results are to be saved to a Tcl variable
          associated with the trait name.  In the second form, a result is
          obtained from previously stored data for each ID.  In the third
          form, stored data for a particular trait is freed.  In the
          fourth form, all stored data is freed.

          The -out and -tclvar arguments cannot be used at the same time.
          If the -out argument is used, inverse normals are simply written
          to a file and nothing is stored, so the second form cannot be
          used.

          FAMID should only be specified if required.
          The rules regarding FAMID are almost identical with
          those used during maximization, so that in general you don't
          have to think about them.  If FAMID field is found in both
          pedigree and phenotypes files, or if pedigree file isn't loaded
          (which wouldn't be allowed during maximization) and FAMID is
          found (only) in phenotypes file,  FAMID is automatically required,
          unless the -nofamid argument is used.  If FAMID is found in
          only one of the two files (and both are loaded), a test for
          ID uniqueness is performed, then if ID's are unique without FAMID,
          it is not required, otherwise FAMID is required and if not present,
          it is an error.  FAMID can be mapped to any other field name using
          the field command.

          When using these obscure forms of the inormal command, it is
          recommended to load the data and then use it in short order,
          even though the inormal command doesn't intrinsically require
          this.  Internal "inormal" data is not saved from one SOLAR
          session to the next.

          BEWARE that "maximize" or any SOLAR command that performs
          maximization, such as "polygenic" or "multipoint", may clear
          out inverse normal data stored using -tclvar.  Also, if
          different layers of procedures get inormals on traits
          with the same name from different files, and their inormal
          operations overlap, there
          could be problems.

          When the -class option is used, the traitname
          is qualified with a suffix like .SOLARclass.1 (where 1 is the class
          number).  To free such a classed trait, the fully suffixed name must
          be used.  For example, for trait q4 and class 1, the command would be:

              inormal -free q4.SOLARclass.1

Return to command index

A1.88 interval


Purpose:  Set cM interval and range for multipoint scanning each chromosome

Usage:      interval <count> <range> ; set increment count and range
            interval <count>         ; default range is 0-* (* means last)
            interval                 ; displays current setting

Examples:   interval 5               ; Check every 5 cM
            interval 1 101-109       ; Check every 1 cM between 101 and 109
            interval 10 200-*        ; Check every 10 cM after <200 cM>
            interval 0 100           ; Check at position <100 cM>
            interval -5 *-100        ; Check every 5 cM from last to 100

Shortcuts: interv - intervals

Return to command index

A1.89 intraitclass


purpose: set up trait with classwise inormalization

Usage:  intraitclass <traitname> [<class>]+

Example: trait averagefa 0 2 3

Notes: Do model new before using this command.

Return to command index

A1.90 invert


invert from version 6 has been renamed "transpose" in version 7
see "help transpose" for more information

Return to command index

A1.91 is_nan


Purpose:  Check if value is NaN (Not a Number)

Usage:    is_nan <number>

          Returns 1 if number is NaN, 0 otherwise.

Notes:    This is most useful in scripts, when getting the likelihood or
          other value using read_model, you should check to be sure it
          is not NaN due to maximization convergence error.

Return to command index

A1.92 joinfiles


Purpose:  Join files horizontally based on ID's

Usage:    joinfiles [-all] [<filename>]* [-out <filename>] [-key <keylist>]
                    -list <filename> -chunk <chunksize>

          -out <filename>  Write joined records to this file (default is
                           joinfiles.out in the working directory)
          -key <keylist>   Base join on these key(s) (default is ID or EGO,
                           and also FAMID if FAMID is present in all files)
          -all             Filenames may be patterns with wildcards
                           (including *  to match any sequence of characters
                           and ? to match any one character) and/or names of
                           directories whose files will be included.
                           (Files in subdirectories are not included.)
                           When using -all, no system limit on open files
                           is applicable.
           -list <filename> Include all files listed in <filename>, which
                            has one filename in each line, which may be
                            a pattern with wildcards.  Only one -list
                            may be used.  When using -list, no system limit
                            on open files is applicable.
            -chunk <chunksize>  The chunk size used in joining files under
                                -all and -list options.  By joining only one
                                chunk of files at a time, the system limit
                                on open files is bypassed.  The default is
                                100.

          Some additional esoteric options are described below in Note 7.

Notes:

1)  Each file may either be Comma Delimited or Pedsys, and sucessive files
    may use different formats.

2)  The output file will be Comma Delimited, thus this command also serves
    to translate one or more Pedsys files to Comma Delimited format.

3)  Any field mapping of ID and FAMID to some other name through the 
    "field" command will be applied if the keys are defaulted.  Key
    matching is case insensitive, so the key might be "ID" in one file
    and "id" in the next.

4)  Records will be output for every ID encountered regardless of whether
    that ID is found in all files.

5)  If keys are specified, you'd better know what you are doing.
    No field name mapping or testing of whether FAMID is required
    will be done.  However, whether defaulted or not, the availability
    of keys in every file will be tested.

6)  If the same filename is repeated in the list of files to join, the
    repeats are ignored (for technical reasons).  If you must join the
    same file to itself for some legitimate reason (???), copy
    to a another filename first.

7)  If the same field name(s), other than the key(s), are found in more
    than one file, the default action is to rename them in the output
    file in a way so as to be unique.  The following format is used:

    <field name>.<filename>[.<number>]

    If adding the filename makes the field name unique, that is all that
    is done, which makes for a nice new name.  For example:

        q4.qaw10.phen    (phenotype q4 in file gaw10.phen)

    Otherwise, <number> is applied, using the first number (starting
    from 2 and ending with 30002) that makes the field name unique.
    Unless there are more than 30000 matching field names, this will
    guarantee that a unique name will be found and  used.  Also, 
    with reasonably short names, it is likely that the resulting name
    will be unique within 18 characters, which is currently required
    for trait and covariate names.  However, uniqueness within 18 
    characters is not guaranteed as that would require ugly renaming
    and it's quite possible the 18 character limit may be removed
    eventually anyway.  Uniqueness testing is case insensitive.

    There are two other optional ways of handling field names which are
    not unique.  These option specifiers may be used anywhere after the
    command name but apply globally to all files.

          -uniqueonly    Remove fields which are not unique among files
                         (except keys).

          -norename      Don't rename fields that are not unique, just
                         include them.  (Note: If this option is applied,
                         the resulting file may cause problems with
                         various SOLAR commands.  For example, the
                         "residual" command won't like it even if the
                         duplicated field is NOT used as a trait or
                         covariate.)

8)  If the same fieldname is repeated in one file, that field is
    not included in the output.  (Such fields could not be selected
    as traits or covariates in SOLAR anyway.)  This typically occurs
    when there is a field named BLANK to separate columns in a Pedsys
    file.  Also, fields with the "null" name (zero characters or all
    blanks) are not included.

Return to command index

A1.93 key


Purpose:  Make user key

Usage:    key <make> <username>

Note:     This is intended for use only by authorized personnel.


Shortcuts: key - key

Return to command index

A1.94 linkgsd


solar::linkqsd0 --
solar::linkqsd --

Purpose:  Set up linkage model with esd, gsd, qsd parameters (EXPERIMENTAL)

Usage:  linkqsd <path>/<mibdfile>
        linkqsd0 <path>/<mibdfile>

Example:  model new
          trait q4
          covar age sex
          polygsd
          maximize
          gsd2h2r
          chromosome 9 10
          interval 5
          mibddir gaw10mibd
          multipoint -link linkqsd0 -cparm {esd gsd qsd}

Notes:  Polygenic parameters must already have been set up (use the
        polygsd command).  Prefereably it should have been maximized
        also (use the maximize command).

        linkqsd modifieds the model currently in memory.  linkqsd0 assumes
        the existance of a null0 model in the maximization output directory,
        loads that, and then adds the linkage element.

        We have not worked out suitable heuristics to force maximization
        of difficult models, so at the present time this parameterization
        is not as robust as our standard parameterization.

Return to command index

A1.95 linkmod


Purpose:  Set up parameters and constraints for multipoint linkage model

Usage:    linkmod [-add [-epistasis]] [-zerostart] [-2p] <ibdfile>
                  [-cparm] [-se]

                   -add means add this to previous linkage elements
                      (otherwise, it supercedes the previous one)
                   -zerostart is used for score analysis (see below)
                   -2p  twopoint (ibd not mibd)
                   -epistasis sets up epistasis parameter between new
                        element and previous linkage element.  Use with
                        "-add".  Not supported for bivariate.
                   -se     Turn on standard error calculation option.
                           The default is to turn it off.
                   -cparm  "Custom Parameterization"  Simply replace old
                           matrix with new matrix.  Parameters, constraints,
                           and omega are unchanged.  A "prototype" model
                           with suitable matrix, parameters, omega, and
                           constraints must already be loaded.  See
                           final note below for more information.
                           Note: if -cparm is specified, standard errors
                           are NOT turned off, but left in whatever state
                           they were in when linkmod was called.

Notes:    Use the -2p option for twopoint models.  linkmod2p is now
          obsolescent (linkmod -2p is invoked when you give the linkmod2p
          command).

          A polygenic or linkage model should already have been
          created and maximized first.  Boundaries are set around existing
          values assuming this has been done.

          Multiple linkage terms will be included if Solar_Fixed_Loci
          is defined.  The script multipoint does this.

          By default, standard error is turned off.  You may turn it on
          again by giving the command 'option standerr 1' after running
          linkage and before running maximize.

          The -zerostart option starts the new linkage component at 0.
          (The linkage component MUST NOT HAVE BEEN ALREADY CREATED!)
          This is used for score test analysis.

          The -cparm option requires that a prototype linkage model
          with all required matrices, parameters, omega terms, and
          constraints be already loaded.  Other than that, however, it
          ATTEMPTS to be as general as possible.  However, it is
          necessary to make one assumption regarding the name of
          the first matrix.  If the -2p option is specified, the
          relevant matrix that will be modified must be named
          ibd or ibd1, ibd2, etc.  Otherwise, the relevant matrix
          must be named mibd or mibd1, mibd2, etc.  It is the
          ibd or mibd matrix with the highest number, if any,
          which will be replaced.  If a second matrix column such 
          as d7 or delta7 is included, it will be assumed to be
          included in the replacement matrix as well.  This option
          is used by "multipoint -cparm" and "twopoint -cparm".

Shortcuts: linkm - linkmodel

Return to command index

A1.96 linkqsd


Purpose:  Set up linkage model with esd, gsd, qsd parameters (EXPERIMENTAL)

Usage:  linkqsd <path>/<mibdfile>
        linkqsd0 <path>/<mibdfile>

Example:  model new
          trait q4
          covar age sex
          polygsd
          maximize
          gsd2h2r
          chromosome 9 10
          interval 5
          mibddir gaw10mibd
          multipoint -link linkqsd0 -cparm {esd gsd qsd}

Notes:  Polygenic parameters must already have been set up (use the
        polygsd command).  Prefereably it should have been maximized
        also (use the maximize command).

        linkqsd modifieds the model currently in memory.  linkqsd0 assumes
        the existance of a null0 model in the maximization output directory,
        loads that, and then adds the linkage element.

        We have not worked out suitable heuristics to force maximization
        of difficult models, so at the present time this parameterization
        is not as robust as our standard parameterization.

Return to command index

A1.97 linkqsd0


solar::linkqsd --

Purpose:  Set up linkage model with esd, gsd, qsd parameters (EXPERIMENTAL)

Usage:  linkqsd <path>/<mibdfile>
        linkqsd0 <path>/<mibdfile>

Example:  model new
          trait q4
          covar age sex
          polygsd
          maximize
          gsd2h2r
          chromosome 9 10
          interval 5
          mibddir gaw10mibd
          multipoint -link linkqsd0 -cparm {esd gsd qsd}

Notes:  Polygenic parameters must already have been set up (use the
        polygsd command).  Prefereably it should have been maximized
        also (use the maximize command).

        linkqsd modifieds the model currently in memory.  linkqsd0 assumes
        the existance of a null0 model in the maximization output directory,
        loads that, and then adds the linkage element.

        We have not worked out suitable heuristics to force maximization
        of difficult models, so at the present time this parameterization
        is not as robust as our standard parameterization.

Return to command index

A1.98 load


Purpose:  Load a user data file (pedigree, phenotype, marker, etc.)

Usage:    load <object-type> [<options>] <arguments>

          load pedigree <filename>
          load phenotypes <filename>
          load matrix [-sample | -allow] <filename> <name1> [<name2>]
          load matrix [-cols <tcl-list>] <filename> ;# MathMatrix
          load model <filename>
          load freq [-nosave] <filename>
          load marker [-xlinked] <filename>
          load map [-haldane | -kosambi] <filename>

Notes:    There is much more information available for each variant of
          the load command.  See the documentation for the
          particular object type, for example, "help pedigree".
          For information about a particular file format, see the
          applicable file-* documentation, for example, "help file-pedigree".


Shortcuts: load - load

Return to command index

A1.99 load_house


Purpose: Loads house file when using empirical pedigree
         
 This command loads household matrix to be used with empirical 
 pedigree. The matrix is expected to be in a two-column csv format as follows.
 Id, HouseID

 Where the ID column should match the IDs of the empirical pedigree file. 
 HouseID is a sequential ID that marks subjects from the same house hold. 
 This file can include only the subjects for whom household information is 
 available.  Subjects omitted from that file will be assumed from independent
 households. For example

 ID,HouseID
 A101,1
 A102,1
 A103,1
 A121,2
 A122,2
 A123,2

 Will code a matrix where subjects A101-3 live in the same household and 
 subjects A121-3 form another household. 

Return to command index

A1.100 loadkin


Purpose:  Load a matrix named phi2.gz containing phi2 and delta7

Usage:    loadkin

Notes:    If the file phi2.gz does not exist, this command will be
          silently ignored.  This command is mainly for scripts.  You
          can perform the same action with a known matrix file with:

          matrix load phi2.gz phi2 delta7

Shortcuts: loadk - loadkinship

Return to command index

A1.101 lod


Purpose:  Calculate LOD score

Usage:    lod [<test-loglike> <null-loglike>] [<options>]
          options := [-auto|-off|-raw] [-trait <N>] [-rhoq <N>] [-v]
                     [-1t|-2t|-3t|-4t|-t1|-t2|-t3|-t4] [-nolodadj]

          If no likelihoods are specified, the likelihoods of the
          "current" model and the applicable "null" model are used.

          -auto        Convert multivariate LOD to 1df effective LODs based
                         on number of traits in current model and constraint
                         of relevant rhoq's (default)
          -off         Do not convert LODs to 1df effective
          -raw         Do not perform LOD conversion or lodadj
          -traits <N>  Convert multivariate LOD to 1dF assuming <N> traits
          -1t or -t1   Assume 1 trait (same as "-traits 1")
          -2t or -t2   Assume 2 traits (same as "-traits 2")
          -3t or -t3   Assume 3 traits (same as "-traits 3")
          -4t or -t4   Assume 4 traits (same as "-traits 4")
          -rhoq <N>    Convert multivariate LOD to 1df assuming <N>
                         constraints of relevant rhoq's
          -nolodadj    Do not perform lod adjustment (lodadj)
          -v           verbose: Show adjustment and conversion steps

Examples: outdir test1
          load model test1/null1
          lod
          lod -v
          lod -2000.51 -2030.87
          lod -trait 3 -rhoq 1 -v -2000 -2030
          lod -raw -2000 -2030

Notes:    If no likelihoods are specified, the current model must have
          been maximized through a command such as "maximize," "twopoint",
          or "multipoint", and the applicable null model should be saved as
          nullX.mod (e.g. null0.mod, null1.mod) where X is the number
          of active linkage elements, which is assumed to be one less
          linkage element than in the current model.  Linkage elements are
          parameters named h2q1, h2q2, etc.  The null model must have
          been saved in the maximization output directory, either named
          after the trait or set by the outdir command.

          By default, SOLAR provides easily interpreted "1 df effective" LODs
          which are equivalent to those in univariate models.
          However, you can also have complete control over the LOD
          conversion performed either using arguments here or
          preferences set globally with the lodp command.  Options
          specified here override the defaults and lodp preferences.

          The correction of 2 trait LODs to 1dF effective LODs is based
          on this formula: the LOD is converted to chi square with
          1/2 1df, 1/4 3df, and 1/4 point mass at zero.  If rhoq is
          constrained, the formula is 1/2 1df, 1/4 2df, and 1/4
          point mass at zero.  This is then converted to a 1/2 1df
          chi square of equivalent p-value, which is divided by
          2ln10 to get the 1df effective lod score.

          The correction of 3 trait LODs to 1dF effective LODs is based
          on the formula: the LOD is converted to chi square with
          3/8 1df, 3/8 3df, 1/8 6df, and 1/8 point mass at zero.
          For each rhoq constrained, the 6df is changed downward
          by 1df.

          The conversion of higher multivariate LODs follows a similar
          expanding sum.  If you wish to see the weights used, use the
          lod command with the -v option.

          Empirical LOD adjustment, if any, is automatically applied (see
          the lodadj command) unless the -raw option is used.  Unless you
          specify -raw, SOLAR will need to search the output directory for
          a lodadj.info file, which means that a trait or outdir must
          have been selected.

          Empirical LOD adjustment is not yet supported for bivariate
          models.  The lodadj value is ignored when bivariate LODs are
          computed, and, in the cases where the lodadj value would be
          shown (such as in the multipoint.out file, or if lod is called
          from the command prompt) a warning message is shown instead.

          In SOLAR version 3.0.2, the "clod" and "lod" commands were
          combined into a new "lod" command.  The options allowed
          have changed compared with the earlier "clod" ; the original
          "lod" command did not allow any arguments.

          Use the "lodn" command if you the current model may not use
          the "h2q1" linkage parameter and you are not specifying
          loglikelihoods explicitly.
          
          See also lodn, lodp, lodadj.

Shortcuts: lod - lod

Return to command index

A1.102 lodadj


Purpose:  Use or calculate an empirical LOD adjustment

Usage:    lodadj [-calc] [-off] [-null <N>] [-nreps <#replicates>] [-restart]
                 [-restorenull0] [-query] [-summary]

             lodadj   If no arguments are given, this turns ON the
                      previously calculated empirical LOD adjustment
                      for the current trait/outdir.  This value is
                      stored in a file named lodadj.info if currently
                      ON or lodadj.off if currently OFF.
                      It is an error if the null0 model has a later
                      timestamp than the lodadj.info file.  (You can
                      update the timestamp of the lodadj.info file with
                      the Unix "touch" command if you are sure it is OK.)
    
            -off      Turn OFF empirical LOD adjustment.

            -query    Return the LOD adjustment currently in effect
                      (1.0 if none).

            -calc     Calculate and use a new empirical LOD adjustment.
                      (This requires an existing null0.mod file from the
                      polygenic command.)  The adjustment is turned ON.

            -null     Use an existing nullN model instead of null0.

            -nreps    Number of replicates.  In each replicate, a
                      fully-informative marker, unlinked to the trait,
                      is simulated, IBDs are calculated for this marker,
                      and a LOD is computed for linkage of the trait
                      to this marker.  The default number is 10000.

            -restart  (or -r) Perform additional replicates, adding the
                      LODs to the set of previously computed LODS, until
                      the total number of replicates (old and new) reaches
                      the number specified by the -nreps argument.  The
                      same null model is used as in the previous replicates;
                      the -null argument is ignored if present.

            -cutoff   Specify the fraction of the highest observed LODs
                      that will not be used to compute the empirical LOD
                      adjustment.  For example, if the cutoff is .01, then
                      the largest 1% of the observed LODs will be ignored
                      when the LOD adjustment is calculated.  The default
                      cutoff is .05.

            -overwrite       (or -ov) Recalculate the empirical LOD
                             adjustment.  Existing LOD adjustment output
                             files in the trait/outdir will be overwritten.

            -restorenull0    Restore the null0 model in effect at the time
                             the last empirical LOD adjustment was
                             calculated.  This will overwrite a later
                             null0 model.

            -summary  Display a summary of the LOD adjustment calculation.
                      The summary shows the distribution of the original
                      and adjusted LOD scores, the number of replicates
                      performed, and the name of the null model.

Notes:     The -calc option produces output files in the trait/outdir:
           lodadj.out, lodadj.lods, and lodadj.info.  lodadj.out contains
           summary information, lodadj.lods contains the raw actual vs.
           theoretical LODs, and lodadj.info contains state information
           including the null model currently in effect.

           The lodadj value and state (on or off) is saved in each
           trait/outdir (by the lodadj.info or lodadj.off file).  This
           is now preserved when restarting SOLAR.

           lodadj is now supported for bivariate lods.  Since the
           correction is always computed with one additional degree of
           freedom, the lodadj adjustment is applied AFTER the
           lod correction to one degree of freedom, and the user is
           advised not to disable the one degree of freedom correction
           with the lodp command.

Return to command index

A1.103 lodn


Purpose:  Calculate LOD score for current model relative to nullX

Usage:    lodn X <options>

               X   Number indicating index of relevant null model (for
                   example, 0 for null0, the model having no linkage
                   elements).
               <options>  See "help lod".

Notes:    In many cases you can more easily use the "lod" command, which
          determines the applicable null model automatically, or, you can
          specify the loglikelihoods.  "lodn" may be useful if you
          are not sure whether the current model contains h2q parameters,
          for example, if it includes a custom parameterization.

          The current model must have been maximized, either through the
          "twopoint" or "multipoint" command, or directly with the
          "maximize" command.

          The null model should be saved as nullX.mod (for example, null0.mod
          or null1.mod) where X is the number of active linkage elements.

          There are many special options for LOD calculation.  See
          "help lodp" for more information.  The primary LOD calculating
          procedure in SOLAR is "lod" which lets you specify the
          loglikelihood values and option(s) directly.

          See also lod, lodp.

Shortcuts: lodn - lodn

Return to command index

A1.104 lodp


Purpose:  Change LOD preferences (such as conversion to 1df)

Usage:    lodp [-auto | -off | [-traits <N>] [-rhoq <N>]]
               [-1t|-2t|-3t|-4t|-t1|-t2|-t3|-t4]

          (If no argument is given, current preferences are displayed.)

          -auto    Convert LODs to 1 degree of freedom (1dF) effective LODs 
                   automatically based on traits and rhoq constraints
                   in current model (default).

          -off     Do not perform LOD conversion to 1 df equivalence

          -traits <N>      Convert assuming there are/were <N> traits
          -1t or -t1 etc.  Shortcuts for 1 trait, 2 traits, etc. up to 4
          -rhoq <N>        Convert assuming <N> rhoq's are constrained

Notes:    If -traits is specified without -rhoq, -rhoq is assumed to be 0.
          If -rhoq is specified without -traits, trait count is determined
          automatically (and might be determined to be 1, in which case
          rhoq specification is irrelevant).  If you need to set both
          -traits and -rhoq, you must give both in the same lodp command.

          This should not be confused with lodadj (see).  The lodp command
          sets global preferences for "degrees of freedom" issues which
          arise with multivariate models.  The default "-auto" will
          apply conVERSION BAsed on the number of traits in the current
          model and the number of relevant rhoq's (defined below) which
          are constrained.  LODs will be converted to 1 degree of freedom
          effective LODs (for which the traditional cutoff in statistical
          genetics for a genome-wide linkage scan is 3).

          Relevant rhoq's are parameters prefixed rhoq which correspond
          to the highest numbered linkage element.  For example, in a
          bivariate linkage model with one linkage element, the relevant
          rhoq whould be "rhoq1", but with two linkage elements, it would
          be "rhoq2".  For a trivariate model with one linkage element,
          the relevant rhoq's would be: rhoq1_12, rhoq1_13, rhoq1_23.

          The preferences set by this command will apply to all LOD scores
          subsequently calculated by SOLAR, including those reported by
          the twopoint and multipoint commands, and the lod and lodn
          commands.  The lod command, which is what ultimately
          calculates all LOD scores, has options which are similar to
          lodp.

          Changes to lodp preferences apply only within the current
          SOLAR session, so the command must be re-entered each time
          SOLAR is started or at the beginning of SOLAR scripts when
          you need to change the defaults.

          For more discussion of the how the conversion is performed,
          which rhoq constraints are relevant, etc., see help for
          the lod command.
          
          See also lod, lodn, and lodadj.

Return to command index

A1.105 loglike


Purpose:  Get the log likelihood of the current model

Usage:    loglike

Note:     This could be used in a TCL script like this:

          set firstll [loglike]

          All this procedure does is retrieve the stored loglikelihood.
          The current model must have been maximized first, either with
          the maximize command, or with a command like twopoint or
          multipoint which maximizes and compares many models.  If
          the current model has not been maximized, an error is raised.


Shortcuts: logl - loglikelihood

Return to command index

A1.106 madj


Purpose:  Apply current lodadj to a previous multipoint run

Usage:    madj
          madj -restore    ;# restore ORIGINAL multipoint files
          madj -undo       ;# restore previous multipoint files

Notes: trait or outdir must already have been selected.

       madj applies loadadj from lodadj.info file in trait/outdir.

       madj may be used on incomplete multipoint runs prior to restarting.

       It is not necessary to -restore before applying another lodadj.
       Some roundoff errors occur in last decimal place, but do not
       "accumulate" over multiple runs because LOD's are calculated
       from loglikelihood values, not previous LOD's.  NULL models must
       be present.

       If there is an error, there should either be a "no files modified"
       or "restoring previous files" message.  If not, or if it is
       desired to restore the ORIGINAL multipoint files for any
       reason, use the command "madj -restore."  The FIRST time madj is
       run, those files were saved as multipoint*.save.  (The PREVIOUS
       set of multipoint files were also saved as multipoint*.tmp, and
       may also be restored with the "madj -undo" command.)

Shortcuts: madj - madj

Return to command index

A1.107 map


Purpose:  Process the map data.

Usage:    load map [-haldane | -kosambi | -basepair] <filename>
                                          ; loads map file
          map show [<marker> ...]         ; displays map data
          map unload                      ; unloads map data
          map names                       ; displays marker names
          map fname                       ; returns name of map file
          map chrnum                      ; returns chromosome identifier
          map nloci                       ; returns number of loci in map
          map func                        ; returns mapping function code
                                            ('b'=basepair, 'h'=Haldane,
                                             'k'=Kosambi)

          In the map file, marker locations are typically given in cM.
          When multipoint IBDs are computed, the distances between pairs
          of markers are converted to recombination fractions by means of
          a mapping function.  By default, the Kosambi mapping function
          is assumed.  The Haldane mapping function can also be used by
          specifying the -haldane option when loading the map file.

          Marker locations can also be specified as integer numbers of
          basepairs.  This is useful, for example, when the markers are
          SNPs with known offsets in basepairs from some starting location.
          When basepair locations are used, the mapping function is called
          "basepair" rather than Kosambi or Haldane, but in fact there is
          no mapping provided from basepairs to recombination fractions.
          Therefore, such maps cannot be used to compute multipoint IBDs.

          In map files which contain cM locations, the first line of the
          file can optionally include the name of the mapping function,
          in which case no command line option is required to specify the
          mapping function.  If the load command specifies a different
          mapping function from that specified in the map file, the load
          command option takes precedence.

          Map files which contain basepair locations must either have
          the basepair mapping function specified on the first line of
          the file or be loaded using the -basepair option.

          If a marker name is not specified for the 'map show' command,
          all currently loaded map data will be displayed.

          The name of the currently loaded map file and the mapping
          function are stored in the file 'map.info' in the current
          working directory.  This file persists between SOLAR runs,
          which means that the map file will still be loaded the next
          time SOLAR is invoked (from within the same working directory.)

For a description of the map file, enter 'file-map'


Shortcuts: map - map

Return to command index

A1.108 marker


Purpose:  Process the marker genotype data.

Usage:    load marker [ -xlinked ] <filename>   ; loads marker file
          marker unload [ -nosave ]       ; unloads marker genotype data
          marker discrep [<marker> ...]   ; checks for marker discrepancies
          marker names                    ; displays marker names
          marker show [<marker> ...]      ; displays summary of marker data
          marker fname                    ; returns name of marker file

          The '-xlinked' option of the 'marker load' command can be given
          when loading genotype data for X-linked markers.  Alternatively,
          the XLinked option can be set with the 'ibdoption' command.

          Genotype data will not be unloaded for markers for which MLE
          allele frequencies have been computed but not saved to a file.
          To save MLE allele frequencies, use the 'freq save' command.
          To unload markers without saving MLE allele frequencies, give
          the '-nosave' option in the 'marker unload' command.

          If a marker name is not specified for the 'marker discrep' or
          the 'marker show' command, the command applies to all markers
          currently loaded.

          The state of the currently loaded marker data is stored in the
          file 'marker.info' in the current working directory.  This file
          persists between SOLAR runs, which means that the markers which
          are loaded at the end of a session will still be loaded the
          next time SOLAR is invoked (from within the same working
          directory.)

Notes:    The marker load command creates a subdirectory in the current
          working directory for each marker.  Marker subdirectories are
          named 'd_<marker>', where <marker> is the marker name.  The
          contents of a subdirectory will depend on the type of marker
          processing performed, and will include various input, output,
          and (possibly) error files.  The marker subdirectories are
          deleted when the marker genotype data is unloaded.

          The loci in the marker file must all be autosomal or all be
          X-linked.  By default, SOLAR assumes the loci are autosomal.

          The set of markers in the marker file and the set of markers in
          the freq file do not have to be the same.  Allele frequencies
          will be computed for markers that do not appear in the freq file
          at the time these markers are loaded.

For a description of the marker file, enter 'file-marker'


Shortcuts: mark - markers

Return to command index

A1.109 markertest


Purpose:  Test markerfile for discrepancies; list blankable ID's

Usage:    markertest <markerfile> [<marker>]* [<option>]
          <option> := -1 | -many | -ped | -fam <famfile> | -2

          <markerfile> is either exact filename or pattern including *
          <marker> (optional) is either exact name or pattern including *
          If no markers are specified, all markers in markerfile are tested.
          Each marker is tested individually.

          Results are recorded in markertest.out in current directory.
          During normal operation, many error and warning messages may
          appear.  Ignore these messages until markertest has finished.

          If no options are specified, a flexible procedure is used that
          should work in nearly all cases.  It is the same as markertest -1
          followed by markertest -many if necessary.  IMPORTANT: Read the 
          following two paragraphs to understand how those options work.

          -1     Blank one individual at a time.  Report all individual
                 blankings (if any) that fix discrepancy.  If this succeeds
                 only one of the reported individuals needs to be blanked
                 and it is up to user to pick the best one.  However, this
                 procedure is good (if it works) because it will list all
                 the possibilities, and it is relatively fast.  But if it
                 is necessary to blank more than one individual AT THE 
                 SAME TIME, this procedure will fail, so it is frequently
                 inadequate by itself.

          -many  Blank the first individual, and, if that doesn't fix
                 discrepancy, blank the second individual, and then the
                 third, and so on, until the discrepancy is fixed.  Then, 
                 unblank all the individuals that can be unblanked without
                 a discrepancy returning.  The result is one set of
                 individuals that ALL NEED TO BE BLANKED AT THE SAME TIME.
                 It is not necessarily the only such set of individuals,
                 or the smallest set.  This procedure should always succeed
                 in finding a set of blankable individuals.  (This
                 option used to be named -r.)

          -ped    Rather than blanking only one ID at a time, blank
                  whole "extended pedigrees" at a time.  Blankable
                  pedigrees are identified by pedindex.out PEDNO
                  (the index created by "load pedigree") and
                  by the first ID found in that pedigree.  This procedure
                  is the fastest, and is useful in identifying errant
                  pedigrees, provided there is only one errant pedigree.

          -fam    Rather than blanking only one ID at a time, blank
                  nuclear families (or other groups) identified by
                  "famno."  The "famfile" must contain records
                  including "ID" (permanent ID) and "FAMNO" (other
                  fields will be ignored).  There may be more than
                  one record for each ID.  Records may not use
                  "FAMID" to disambiguate ID's.

          -2      Try blanking combinations of 2 ID's until one such pair
                  of blankings fixes the discrepancy.  Because this is an
                  N x N problem, it may take a VERY LONG TIME to finish, but
                  if you are convinced there is one pair that needs to be
                  blanked, this procedure will find it.

Notes:    Pedigree file must already have been loaded

          Markerfile must have ID and and marker(s) only.  Each marker is
          analyzed separately.  Results for all markers are reported in
          markertest.out, which is divided into one section for each
          marker.

          Output is written to markertest.out which is divided into one
          section for each marker.

Shortcuts: markert - markertest

Return to command index

A1.110 mask


Purpose:  To read image mask file and set current voxel

Usage:    mask [<filename>] [-intensity <intensity>] [-index <index>]
          mask -next
          mask -delete

          <filename> is the name of the file containing the mask
          <intensity> is the integer value that defines this mask
          <index> is position within the set of mask-defined voxels
          -next specifies to advance to the next mask-defined voxel
          -delete deletes the mask and frees all related storage

Currently, the file must be NIFTI/RicVolumeSet.  The mask is assumed to
be volume zero.  If intensity is unspecified, the mask is defined as all
non-zero intensities.  The default starting index is zero, meaning the first
voxel with the specified intensity.  Indexes are advanced in the order
x-first and z-last.

The value returned is the x:y:z voxel value, which is also set as the
current voxel value, which is saved to model files.  The voxel value
determines the voxel obtained when reading image files for image traits.
See the voxel command for more information about the voxel and how
image traits are included in csv files.

The mask remains in effect during the current session and unless cleared by
-delete.  It is not cleared by 'model new'.  The intent is that successive
models can use successive voxels advanced with 'mask -next' which is the
most efficient way to advance to the next voxel.  It is possible though not
required to specify filename, intensity, and index in one command.
It is also possible to change the -intensity or -index in subsequent
commands.  If the intensity is changed, we go back to the first voxel with
that intensity. The index is always absolute relative to the first voxel at
current intensity specification.  'mask -next' starts from the current voxel
which is normally set by the mask command itself, but could also be
re-specified with the voxel command.  If the <filename> has not been
specifed in the current mask command or a previous mask command, all
other options are invalid.

Return to command index

A1.111 matcrc


Purpose:  Prepend pedindex checksum (CRC) to the beginning of a matrix file

Usage:    matcrc [<path>/]<filename> [[<pedindexdir>] [-notzipped]]

          If pedindex.out is in a different directory, the optional
          <pedindexdir> is specified to point to where it is located.
          It defaults to the current directory "."

          If -notzipped argument is prepended the input matrix is not
          unzipped first so it must be unzipped already.

Quick Notes:

           Matrix files in SOLAR include phi2.gz, IBD, and MIBD files.

           If you have modified a pedindex file directly and are subsequently
           having pedigree mismatch errors with your matrix files, but
           are sure everything is correct, you can simply run matcrc on
           all your matrix files to fix the problem quickly.  However it is
           recommended you read all about matcrc first.

           Beginning with version 7.5.0, support for gzipped csv matrix files
           is added.  The checksum will have id1 and id2 set to "checksum"
           and the checksum value will be in matrix1 after the initial
           decimal point.  All other fields are ignored and are simply copied
           from the first record.

Examples: matcrc phi2.gz
          matcrc gaw10mibd/mibd.1.10.gz

          The first two lines in an unzipped matrix file created by SOLAR
          since Version 4.0 will look like this:

              1     1  0.1268587045 .61377
              1     1  1.0000000  1.0000000

          The first line is a checksum used to verify that the current
          pedindex file is the same as the one used when the matrix was
          created.  This checksum is not required when the matrix is read,
          but if it is present, and it does not match the checksum values
          from the pedindex file, an error will stop maximization until the
          problem is fixed.

          The checksum is immediately followed by the actual data for the
          diagonal identity pair 1,1.

          When the matrix is actually being read, from beginning to end, the
          second line values will overwrite the first line values in the
          memory used for matrix storage.  The checksum line will therefore
          have no effect on maximization results.  The numbers in the
          checksum line are only used for matrix/pedindex validation in
          SOLAR versions 4.0 and beyond.

          matcrc is used by all SOLAR procedures that create matrix files
          since version 4.0.  At this time, it need not be used for user
          created matrix files, it is optional now.  But we recommend that
          users writing their own matrix files also use matcrc to
          postprocess their matrix files so that their matrices are also
          protected from being used with a changed pedigree.  Even a slight
          change to a pedigree file can change the numbering of all
          individuals in the pedindex, and thus make a previously created
          matrix file entirely incorrect.

Notes:    The relevant pedindex.out file is assumed to be in the current
          working directory.  The matrix file can be in any other directory
          so long as the relative or absolute path is given.

          The current user must have write access to the matrix file
          and to the directory itself, because during matcrc operation
          the matrix file is unzipped and then rezipped.

          A Posix compliant  "cksum" command is assumed to be available
          on the system.  It produces a quasi-unique polynomial Cyclic
          Redundancy Check (CRC) on the pedindex.out file; the chances of
          any different pedindex.out file having the same CRC is
          astronomically small.  This includes changes such as swapping
          characters or lines.

          The CRC and Number of Octets (NOCT) is prepended to the beginning
          of the matrix in such a way as to be backward compatible with
          earlier versions of SOLAR.  The first line in the actual matrix
          is used as a template, but the coefficients are replaced with
          CRC and NOCT.  Since this is followed by the same line with the
          actual coefficients, the actual data overwrites the preceding
          numbers and the checksum in the preceding line will have no
          ill effect with any version of SOLAR, even those which did not
          use the checksum.

          Beginning with SOLAR version 4.0, when any matrix is loaded,
          the CRC and NOCT (if present) are detected and compared with
          those from the current  pedindex.out file.  When SOLAR
          version 4.0 loads a matrix with no checksum, no warning is given,
          but it may cause a warning or error in some future version.  A
          mismatch in the CRC or NOCT will cause an error when loading.
          The mismatch can be corrected by running matcrc again, because it
          detects whether or not the matrix has already been signed, and
          removes the previous signature if necessary.  However this should
          only be done after the user has carefully validated the pedigree
          and matrix match.  It would be safest to reconstruct all matrix
          file with the current pedigree, if done by SOLAR procedures
          it would not be necessary for the user to run matcrc since
          it is done automatically by those procedures.

          User construction of matrix files is discussed in Sec. 8.3 of
          the manual.  Matrix files are space delimited with semi-fixed
          format.  The first two columns are for the two IBDID values,
          followed by one or two data value columns.  The data value
          columns should  begin in character position 14 or higher
          (counting the first position as 1).  Once gzipped, the file
          should be processed with the matcrc command after loading the
          pedigree file, to generate a safety checksum value.

Return to command index

A1.112 mathmatrix


Purpose: Create and use matrix objects in algebraic computations

Usage:  For matrices m, n, y, x, vector v, and scalar s:
            (vector is a 1 dimensional matrix either row or column)

        times [-e] $m $n   ;# m or n can be scalar, -e means elementwise
        plus $m $n         ;# elementwise addition, or m and/or n can be
                           ;#   scalars added to each element
        plus $v            ;# unary plus is vector sum for v (only vectors)
        minus $m $n        ;# elementwise subtraction, or m and/or n can be
                           ;#   scalars
        transpose $m       ;# matrix transpose
        inverse $m         ;# matrix inverse
        power $m $s        ;# raise or lower each element to power $s
        dinverse $m        ;# Fast inverse for diagonal matrix
                           ;#  Matrix must be diagonal!  This is not checked!

        ols $y $x  ;# ordinary least squares (lldt fastest) for y=xb+e
        solve [<method>] $y $x   ;# other methods, default is
                                 ;# FullPivHouseholderQR
        evalues [$x]  ;# eigenvalues ;# if no x, take from last evectors
        evectors [$x] ;# eigenvectors ;# if no x, take from last evalues

        mean $v  ;# mean of all elements
        min $m|-where ;# returns min of all elements
        max $m|-where ;# returns max of all elements
                      ;# -where returns {x y} for last min or max
        max $m $s     ;# returns matrix with all values less than $s
                      ;# changed to $s...new matrix only if values changed
        insert $m <row> <column> <number>   ;# note one based indexing
        concatenate <1 | 2>  [$m]* [$v]* ;#concatenate matrices and/or
                                         ;#vectors along specified dimension
             ;# in place of 1 | 2 may use vertical | horizontal

        show $m [<row> <column>]    ;# echo element or entire matrix
             if <row> and <column> specified, scalar element is returned
             if no <row> and <column>, return entire matrix pretty printed
                                       as a tcl nested list (can be input to
                                       matrix new)
        output $m <filename>        ;# write out matrix as csv file
        
        row $m <row>       ;# extract row as a column vector: 1,2,...
        col $m <col>       ;# extract column as a column vector: 1,2,...
        diagonal $m [<offset>] ;# extract diagonal as a column vector
                               ;#   offset to superdiagonal or subdiagonal
        diagonal $v            ;# create matrix with v as diagonal (v must
                               ;#   be a Nx1 or 1xN matrix aka vector)
        rows $m            ;# number of rows
        cols $m            ;# number of columns
        shuffle $v         ;# shuffle the elements of vector v
        shuffle $v n       ;# shuffle the elements of vector v into n-1 cols
                           ;# retaining first column unshuffled
        identity <rows>    ;# create an identity matrix

Matrix commands for creating, loading, and deleting matrixes

        matrix new { <row>* }
           <row> = { <number>* }
       load matrix [<option>]* <csvfile>
           <option> = <column-list>|-hoheader
           <column-list> = -cols { <column> }
           <column> = [<name>|'<name>'|<index>|<repeating-sequence>|end]+
           <repeating-sequence> = { <number>+ }

Examples of new and load:

       matrix new {{1 2 3} {4 5 6}}
       load matrix -noheader design.mat.csv  ;# .mat.csv are headerless
       load matrix -cols {{1} 1 age bmi '2020' end} phen.csv
                          ;# first column is all 1's

Delete and reset commands:

        matrix delete $m
        matrix reset         ;# free all MathMatrix storage and ID's
                             ;# do reset as much as possible to free memory

MathMatrix commands

        mathmatrix debug on  ;# print messages as operations are performed

Notes

You can save each returned matrix in a Tcl variable as you do with
other operations, such as:

  set X [load matrix -cols {{1} sex age} phen.csv]
  set Xt [transform X]

Matrices are returned as internally numbered identifiers, starting from

  .mm.1

As shown, you may assign these to variables like any other returns in
SOLAR.  Or you can simply remember the identifiers themselves and use them
directly, skipping the assignments and referencing.  This can be convenient
for interactive use as it may reduce typing, but normal assignment is better
for scripting.

  solar> matrix reset
  solar> load matrix -cols {{1} sex age} phen.csv
  .mm.1
  solar> rows .mm.1
  1009

Actual matrix objects are stored internally in C++.  The storage use
will continue to increase as more operations are done, as every
intermediate matrix created is saved.  As soon as convenient you
should delete matrices or do "matrix reset" to clear
all matrix storage and identifiers.

*matrix reset frees all the memory used by MathMatrix objects and
resets the MathMatrixID to start at .mm.1.  It is recommended to
do this at the beginning and/or end of matrix calculations.

The -cols option may be used in several ways.  You can specify the
columns to be taken from a CSV file, and in what order.  You can
specify the columns by name (the natural way), or by column number
(the first column is defined as column 1) which would be required for
headerless CSV files as produced by other math programs.  In addition
to taking columns from the file, you can specify single numbers or
or sequences of numbers which will be repeated to fill up the column.

Here is an example of loading a Design Matrix out of two named columns from
a CSV file (age and bmi) with a leading column of all 1's:

load matrix -cols {{1} age bmi} phen.csv

The entire column list must be enclosed in curly braces, and sequences
of numbers to be repeated must be further enclosed in an inner set of
curly braces.

Here is an example of loading a headerless csv file having 4 columns in
reverse column order:

load matrix -cols {4 3 2 1} -noheader oldmat.csv

Numbers within the column list are assumed to be column indexes, starting
from 1 for the first column.  If you wish to identify columns in a CSV
file which happen to have numbers as their names, you must enclose them
in single apostrophes.

For example, if phen.csv has a column named "2020" you could use it like
this:

load matrix -cols {age bmi '2020'} phen.csv


The evalues and evectors commands allow you to compute a new real pseudo
eigen decomposition, or simply echo the eigvenvalues or eigenvectors of
the previous decomposition.  This way you can get either one first,
or only if needed.  Generally you should extract your eigenvalues
or eigenvectors as quickly as possible and certainly before calling any
other routines which might also use eigen decompositions.

(The pseudoeigendecomposition is the only type currently supported as
 it guarantees real numbers.  A general eigendecomposition would require
 complex numbers and is not yet supported.)

For example, starting with matrix x, you could get the eigenvalues with
the evalues command, then get the eigenvectors:

set values [evalues $x]
set vectors [evectors]

Or you could do this in reverse order:

set vectors [evectors $x]
set values [evalues]

If a matrix filename ends in ".mat.csv" it will automatically be handled as
headerless matrix file and the -noheader argument is not required.

Return to command index

A1.113 matrix


Purpose:  Set up sample matrix (phi2 etc.) or math matrix

SOLAR has two kinds of matrix, the original sample matrix variables used
during maximization (e.g. phi2) and a newer matrix algebra object used
for high speed math calculations.  For clarity these may be called
"sample matrix" and "math matrix".

Usage:    matrix load [<option>] <filename> <name1> [<name2>]
            ;# loads sample matrix variable(s) (one or two) from file}

          set matname [matrix load <filename>]
            ;# create math matrix from contents of csv file
            ;# (header line is ignored)

          set matname [matrix new {{1 2} {3 4}}]
            ; create math matrix from tcl list of lists
           
          matrix                         ; displays sample matrices
          matrix delete <name>           ; deletes a matrix (either kind)
          matrix delete_all              ; deletes all sample matrices
          matrix reset                   ; deletes all math matrices
          
          matrix debug                   ; print info about sample matrices
          matrix -return                 ; return sample matrix commands

          <option> == -sample            ; remove missing ID's from sample
          <option> == -allow             ; default missing ID's to diagonal 1
          <option> ==                    ; error if matrix missing ID

Notes:    A .gz suffix is appended to the filename, if it is not specified.
          Matrix files are compressed with (GNU) gzip; gunzip must be
          installed in the user's path.

  CSV Sample Matrix Files (recommended for user written matrix files)

          As of version 7.5.4, matrices can be in CSV format, using user ID
          (not pedindex) as the key field.  The required fields are
          id1,id2,matrix1.  The optional fields are famid1,famid2,matrix2.
          If FAMID is required to disambiguate ID's, famid1 and famid2 must
          be included in each record since not all matrices are limited to
          family interactions.  All other fields in a CSV matrix file are
          ignored.  The mapping from ID to pedindex is obtained from the
          currently loaded pedigree file.  It is recommended that users
          create matrix files in the CSV format.  CSV matrix files must
          be gzipped and a .gz suffix must be at end of filename.  For
          more information, give the command "file-matrix".

          As of version 7.5.5, matrix files are checked against the model
          sample during maximization.  By default, if a diagonal matrix
          entry is missing for anyone who would otherwise be included in the
          sample, an error occurs and messages displaying the missing
          individuals will be printed to the terminal.  There are two options
          which permit missing diagonal entries.  -allow permits such
          individuals and their diagonal entry will be defaulted to 1.0.
          -sample removes such individuals from the sample, and a message
          indicating the number of such removed individuals will be written
          to the maximization output file.

          Matrix names beginning with 'ibd' or 'd7' are 'defaultable.'
          This means that if they have a value of -1, that value is replaced
          with Phi2 or Delta7 respectively.  By default, Phi2 and Delta7 are
          calculated internally, but may be overwritten with externally
          provided matrices.  (This feature is obsolescent and should not
          be used in new code.)

  Traditional Format SOLAR Sample Matrix Files

          Traditional format SOLAR matrix files are discussed in Sec. 8.3 of
          the manual.  These are space delimited with semi-fixed
          format.  The first two columns are for the IBDID1 and IBDID2 and
          they MUST have the same fixed width throughout the file.  Then
          there must be one or two data columns, beginning no earlier than
          column 14 (counting the first position as column 1). The data
          columns are not fixed format and can be separated by one or more
          spaces.  The columns for IBDID1 and IBDID2 should be right
          justified and separated by a space.

          To see an example of the traditional matrix file format, you
          can give these SOLAR commands:

          example
          load pedigree gaw10.ped
          gunzip phi2.gz
          head phi2

          This will show output that starts like this:

          1     1  0.1268587045 .61377
          1     1  1.0000000  1.0000000
          2     2  1.0000000  1.0000000
          3     3  1.0000000  1.0000000
          4     4  1.0000000  1.0000000
          5     5  1.0000000  1.0000000
          6     6  1.0000000  1.0000000
          7     7  1.0000000  1.0000000
          8     8  1.0000000  1.0000000
          9     9  1.0000000  1.0000000
      
          This sample shows the following things:

          IBDID1 starts in the first column (column 1) and is 5 columns
          wide (allowing IBDID's up to 32000, the max allowed).
          
          There is one space separating the IBDID's.  This should not
          be a tab and you should not use tabs at all in this file.

          IBDID2 starts in column 7 and is allowed 5 columns.

          There are two spaces.

          VALUE1 starts in column 14 and is free format.

          There is one space separating value 1 and 2.

          The very first record in this file is not real data but
          a checksum for the matching pedigree file.  The second
          record in the file overwrites the checksum values with
          the actual data for 1,1.  The optional checksum is
          checked against the pedigree file during loading to ensure
          the matrix corresponds to the actual loaded pedigree.
          You can add this checksum to any gzipped matrix file by
          running the matcrc command on it, for example:

          matcrc phi2.gz

          This will prepend a checksum for the currently loaded pedigree
          to the matrix file, deleting a pre-existing checksum if needed.
          You must be sure the currently loaded pedigree is the correct
          pedigree for this matrix.  This step is optional but we
          recommend it.

  Using Matrix Files

          The preferred way to set up linkage models is simply to let
          the multipoint, twopoint, and bayesavg commands do everything
          for you.  For more control, you can also use the "linkmod"
          (multipoint) and "linkmod2p" (twopoint) commands to set up
          (but not evaluate) linkage models.  linkmod and linkmod2p
          set up all the required parameters and constraints for you.
          The command "loadkin" will load the phi2.gz matrix file,
          bypassing the usual on-the-fly calculations performed by SOLAR.



Shortcuts: matr - matrixes

Return to command index

A1.114 maximize


Purpose:  Find the maximum loglikelihood of a model by adjusting
          parameter values within specified constraints.

Usage:    maximize [-quiet] [-out <filename>]

              -quiet  (or -q) Use minimum verbosity while maximizing
              -out (or -o)    Write results to this filename.  The default
                              is 'solar.out' in a subdirectory named after
                              the current trait, or the current 'outdir.'
                              If a filename is specified without any /
                              characters, it will also be written in the
                              default location.  If the filename contains
                              / characters, it is used as a full pathname.

              -noquad         Do not test quadratic

              -who            Do not maximize, but list who would be
                              included in analysis.  File "who.out" is
                              written to trait/outdir containing list
                              of pedindex.out sequential ID's.  This
                              option is used by the "relatives" command

               -runwho        Maximize, AND produce who.out file as above.

               -sampledata    Do not maximize, but write out the data that
                              would be included in the analysis to a file
                              named "sampledata.out" in the maximization
                              output directory.  [WARNING!  The fields in
                              this file are preliminary and subject to
                              change!]  The trait data (fields trait1,
                              trait2, etc.) might be from a phenotype or
                              an expression created by the "define" command.

Notes:    This is the key command of solar.  It is used by polygenic,
          twopoint, multipoint, bayesavg, and other scripts to find
          the model parameter values which produce the maximum
          loglikelihood.

          The final values are not saved to a model file.  To do that,
          issue a 'save model' command after maximization.

          Multiple solar processes should not be writing to the same
          directory.  Use the outdir command to specify different output
          directories.

          During maximization, if no phi2 matrix has been loaded,
          one will be loaded from the file phi2.gz.  If that file is
          invalid or unavailable, starting with Version 8.3.1,
          maximization will not be peformed...unless option
          modeltype AutoPhi2.  AutoPhi2 brings back the original
          default of calculating phi2 on-the-fly from the pedigree
          structure in the operative pedigree file.  Beware the
          on-the-fly numbers are invalid in all cases except the
          simplest univariate quantiative, and they are also invalid
          with empirical pedigrees.

          Advanced SOLAR users sometimes use the raw "cmaximize" command
          which bypasses many of the retry mechanisms (and their implicit
          assumptions) now built-in to SOLAR.  This is not recommended for
          most users.


Shortcuts: maxi - maximize

Return to command index

A1.115 memory


Purpose:  Show total memory used by this SOLAR process

Usage:    memory

Notes:    This is intended primarily for internal debugging purposes.
          Now works on all supported systems.

Shortcuts: mem - memory

Return to command index

A1.116 mg_topedsys


Purpose: Convert comma delimited mga.out to pedsys format

Usage: mg_topedsys <input_filename> <output_filename>


Return to command index

A1.117 mga


Purpose:  Run Measured Genotype (MG) association analysis for every SNP

Usage:    mga [-files [<gcovfile>]+ ] [-snps <snp-tcl-list>]
              [-out <outfile>] [-snplists [<snplistfile>]+]
              [-format csv | pedsys | fortran]  [-noevd] [-notsame]
              [-saveall] [-slowse] [-evdse]
              [-fixupper <boundary>] [-fixlower <boundary>]
              [-ixsnp <SNP>]

SPECIAL NOTE: no filenames or snp names should begin with hyphen (-)
              SNPS should be specified by their actual names, but the
              corresponding file variables should be prefixed by snp_
              The snp_ prefix is added automatically by SOLAR command
              snp covar.

Before invoking mga, user should set up trait and covariate(s) for
the null model.  It is not necessary to maximize the model prior to running
mga.  You may choose to set up the modeltype and omega, but it is not
necessary.  The omega will default to polygenic and the modeltype will be
set to evd (evd1).

When default evd1 models are used, mga can automatically detect when the
sample has changed, and run new null models as needed.  This may be the
fastest approach in most cases.

To use evd2 instead, specify "option modeltype evd2" just before running mga.
For evd2, both -evdse and -slowse produce the same fast evd2 standard
errors.  Evd2 may be faster for individual pedigrees larger than 3000.

At mga completion, mga_null will be the last null model created.
If you wish to save more information, use the -saveall option
described below) to save all maximization output files.

-ixsnp   Do an interaction test with the specified SNP and each one of
         snps in the <snp-tcl-list> or <snplistfile>.  Additional
         terms for chi's and p values related to the interaction will be
         included in the output file.  DO NOT include this snp in the
         starting model, it will be added as needed.

-noevd   Do not use EVD fast maximization or sample checking.  Normally
         this is not needed because mga tests whether EVD can be
         used and changes to -noevd model when EVD cannot be run.  However,
         sometimes EVD uses too much memory, so this can help.*

-notsame  Samples not same, therefore always run models for each SNP!
  Normally this is not needed because mga tests the sample automatically.

If "option samplesametrustme 1" is given prior to running mga,
  mga will only run one null model for the first sample, unless
  EVD is used and fast checking is available.

 By default (starting with version 7.4.2) standard errors are computed
 using the formula sqrt(beta^2/chi^2).  The -slowse and -evdse options
 are now considered obsolescent.  If standard errors cannot be computed
 because chi is zero, the value 10e20 (10^20) will be returned.

-slowse Estimate standard errors, not using EVD
-evdse  Estimate standard errors, using EVD if possible

 The -slowse option will disable EVD because standard errors computed when
 using EVD in the current EVD1 implementation are sometimes inaccurate.

 The -evdse option will compute standard errors and use EVD if
 possible, which will be faster if EVD can be used.  A warning will be
 given about possible SE inaccuracy.

If no -snps or -snplists arguments are given, mga will process
all snp_ prefixed covariates in the currently loaded phenotypes files.

<gcovfile> is one or more snp.genocov file generated by 'snp covar' command.
These files are scanned for snp_ prefixed covariates in addition to the
initially loaded phenotypes files.

<snp-tcl-list> is a tcl-list of snps.  If specified, the -snps list
supercedes -snplists.

<snplistfile> is a file listing the snps to be processed.  Each snp
is listed on a separate line with no blanks or other characters
The snp_ prefix is not required, but allowed. 

Currently, no checking is done to see if any listed snp is found or
not found, or duplicated in subsequent files.

<outfile> defaults to mga.out in the current output directory, which is
cleared out first, then one line of information is produced for each SNP.
If another filename is specified, it is located relative to the output
directory or full absolute pathname, and it is not erased first to permit
accumulation.  Note: if the "mgassoc" (original) command name is used, the
output file is named mgassoc.out by default.

-debug      Print extra messages useful in debugging, including all
            maximization output

-format     csv, fortran, pedsys.  csv is default.  fortran
            and pedsys formats are identical space delimited except
            that fortran version has a header line at top, pedsys version
            writes a code file (.cde).  Actually, both fortran and
            pedsys options write the code file in case you need it later.

            If you have written pedsys format, it can be converted to
            comma delimited with the command ped2csv.  If you have written
            a comma delimited format file, you can convert it to pedsys
            format with the command mg_topedsys.

-saveall    save all solar.out output files.  Each output file will be named
            mga_<snpname>.out.  Null model output files will be named
            mga_null_<snpname>.out based on the first <snpname> they
            were created for (null will be re-used if sample unchanged).

-fixlower   fix snp beta parameter lower boundaries to this value
-fixupper   fix snp beta parameter upper boundaries to this value

Notes:

 The genotype covariates are numeric variables giving the observed
(or imputed) number of rare alleles an individual has at a particular SNP.

 These fields must be named'snp_<snp>' where <snp> is the SNP name. 
 Ex: snp_rs12345

 *EVD is not currently available for multivariate or discrete models.
 However, discrete traits may be handled as quantitative (at some loss in
 accuracy) and therefore used with EVD by using either
 "option enablediscrete 0" or (generally preferred, to increase SD):
 define qtrait = dtrait * 5

Return to command index

A1.118 mgassoc


solar::mga

Purpose:  Run Measured Genotype (MG) association analysis for every SNP

Usage:    mga [-files [<gcovfile>]+ ] [-snps <snp-tcl-list>]
              [-out <outfile>] [-snplists [<snplistfile>]+]
              [-format csv | pedsys | fortran]  [-noevd] [-notsame]
              [-saveall] [-slowse] [-evdse]
              [-fixupper <boundary>] [-fixlower <boundary>]
              [-ixsnp <SNP>]

SPECIAL NOTE: no filenames or snp names should begin with hyphen (-)
              SNPS should be specified by their actual names, but the
              corresponding file variables should be prefixed by snp_
              The snp_ prefix is added automatically by SOLAR command
              snp covar.

Before invoking mga, user should set up trait and covariate(s) for
the null model.  It is not necessary to maximize the model prior to running
mga.  You may choose to set up the modeltype and omega, but it is not
necessary.  The omega will default to polygenic and the modeltype will be
set to evd (evd1).

When default evd1 models are used, mga can automatically detect when the
sample has changed, and run new null models as needed.  This may be the
fastest approach in most cases.

To use evd2 instead, specify "option modeltype evd2" just before running mga.
For evd2, both -evdse and -slowse produce the same fast evd2 standard
errors.  Evd2 may be faster for individual pedigrees larger than 3000.

At mga completion, mga_null will be the last null model created.
If you wish to save more information, use the -saveall option
described below) to save all maximization output files.

-ixsnp   Do an interaction test with the specified SNP and each one of
         snps in the <snp-tcl-list> or <snplistfile>.  Additional
         terms for chi's and p values related to the interaction will be
         included in the output file.  DO NOT include this snp in the
         starting model, it will be added as needed.

-noevd   Do not use EVD fast maximization or sample checking.  Normally
         this is not needed because mga tests whether EVD can be
         used and changes to -noevd model when EVD cannot be run.  However,
         sometimes EVD uses too much memory, so this can help.*

-notsame  Samples not same, therefore always run models for each SNP!
  Normally this is not needed because mga tests the sample automatically.

If "option samplesametrustme 1" is given prior to running mga,
  mga will only run one null model for the first sample, unless
  EVD is used and fast checking is available.

 By default (starting with version 7.4.2) standard errors are computed
 using the formula sqrt(beta^2/chi^2).  The -slowse and -evdse options
 are now considered obsolescent.  If standard errors cannot be computed
 because chi is zero, the value 10e20 (10^20) will be returned.

-slowse Estimate standard errors, not using EVD
-evdse  Estimate standard errors, using EVD if possible

 The -slowse option will disable EVD because standard errors computed when
 using EVD in the current EVD1 implementation are sometimes inaccurate.

 The -evdse option will compute standard errors and use EVD if
 possible, which will be faster if EVD can be used.  A warning will be
 given about possible SE inaccuracy.

If no -snps or -snplists arguments are given, mga will process
all snp_ prefixed covariates in the currently loaded phenotypes files.

<gcovfile> is one or more snp.genocov file generated by 'snp covar' command.
These files are scanned for snp_ prefixed covariates in addition to the
initially loaded phenotypes files.

<snp-tcl-list> is a tcl-list of snps.  If specified, the -snps list
supercedes -snplists.

<snplistfile> is a file listing the snps to be processed.  Each snp
is listed on a separate line with no blanks or other characters
The snp_ prefix is not required, but allowed. 

Currently, no checking is done to see if any listed snp is found or
not found, or duplicated in subsequent files.

<outfile> defaults to mga.out in the current output directory, which is
cleared out first, then one line of information is produced for each SNP.
If another filename is specified, it is located relative to the output
directory or full absolute pathname, and it is not erased first to permit
accumulation.  Note: if the "mgassoc" (original) command name is used, the
output file is named mgassoc.out by default.

-debug      Print extra messages useful in debugging, including all
            maximization output

-format     csv, fortran, pedsys.  csv is default.  fortran
            and pedsys formats are identical space delimited except
            that fortran version has a header line at top, pedsys version
            writes a code file (.cde).  Actually, both fortran and
            pedsys options write the code file in case you need it later.

            If you have written pedsys format, it can be converted to
            comma delimited with the command ped2csv.  If you have written
            a comma delimited format file, you can convert it to pedsys
            format with the command mg_topedsys.

-saveall    save all solar.out output files.  Each output file will be named
            mga_<snpname>.out.  Null model output files will be named
            mga_null_<snpname>.out based on the first <snpname> they
            were created for (null will be re-used if sample unchanged).

-fixlower   fix snp beta parameter lower boundaries to this value
-fixupper   fix snp beta parameter upper boundaries to this value

Notes:

 The genotype covariates are numeric variables giving the observed
(or imputed) number of rare alleles an individual has at a particular SNP.

 These fields must be named'snp_<snp>' where <snp> is the SNP name. 
 Ex: snp_rs12345

 *EVD is not currently available for multivariate or discrete models.
 However, discrete traits may be handled as quantitative (at some loss in
 accuracy) and therefore used with EVD by using either
 "option enablediscrete 0" or (generally preferred, to increase SD):
 define qtrait = dtrait * 5


Shortcuts: mgas - mgassociation

Return to command index

A1.119 mibd


Purpose:  Compute multipoint IBDs.

Usage:    mibd relate [-mxnrel <n>]   ; creates relative-class file
          mibd merge                  ; merges marker IBDs
          mibd means [-typed | -all]  ; computes mean IBD by relative-class
          mibd [<from> <to>] <incr>   ; computes multipoint IBDs

          mibd export [-file <filename>] [-overwrite] [-append]
                      [-nod7] [-ibdid] [-byloc] [<chromo> ...]
                                    ; writes MIBDs for specified chromosomes
                                    ; to a file in comma delimited format

          mibd import [-file <filename>] [-nod7] [-ibdid] [<chromo> ...]
                                    ; reads MIBDs for specified chromosomes
                                    ; from a file in comma delimited format

          mibd prep <program> [-version 2.82] [-usefreq] [-qter]
                                    ; prepares input files needed to compute
                                    ; MIBDs using <program>, where <program>
                                    ; is loki, simwalk (sw), merlin, or
                                    ; genehunter (gh)

          mibd import <program> [-file <filename>] [-version 2.82]
                                    ; imports MIBDs from an output file
                                    ; computed by <program>, where <program>
                                    ; is loki, simwalk (sw), merlin, or
                                    ; genehunter (gh)

          Before any mibd command can be run, the directory in which to
          store the mIBDs must be specified with the 'mibddir' command.  This
          specification is stored in a file mibddir.info in the working
          directory, so it need not be repeated in future sessions from
          the same working directory.

          The first record in all matrix files produced by SOLAR, including
          mIBD matrix files, is a checksum and not real data; see the matcrc
          command for details.  This checksum is optional in user created
          matrix files. If present, it prevents a using matrix with a 
          different or changed pedigree.
          
          The 'mibd relate' command can be run after the pedigree
          file has been loaded, and only needs to be run once per data set.
          A tally of the relative classes present in the data set can then
          be displayed with the 'pedigree classes' command.

          A pair of individuals may be related in multiple ways, e.g. as
          1st cousins and as 2nd cousins.  To conserve memory, there is a
          default limit on the number of ways any two individuals may be
          related.  For some complex pedigrees, it may be necessary to
          specify a higher limit using the '-mxnrel' option.

          The remaining commands in the first group of mibd commands shown
          above must be run once for each chromosome.  The 'mibd merge'
          command must be run first, followed by the 'mibd means' command.

          The 'mibd means' can take one of two options: -typed or -all.
          If the -typed option is specified, only the IBDs for pairs of
          individuals who are both genotyped will be used to compute mean
          IBDs by relative class.  If the -all option is specified, the
          IBDs for all pairs of individuals are used.  The default option
          is -all.

          The following steps are required before computing multipoint IBDs
          for chromosome N:

            1. Compute the marker-specific IBDs for all markers on
               chromosome N.  For more information on computing marker-
               specific IBDs, enter 'help ibd'.
            2. Load the map file for chromosome N.
            3. Use the 'ibddir' command to specify the directory where
               the marker-specific IBDs are stored.
            4. Use the 'mibddir' command to specify the directory where
               the multipoint IBDs are to be written.

          Only the last of the first four mibd commands shown above need
          be entered (for a particular chromosome.)  If the merged IBD
          file does not exist, 'mibd merge' will automatically be run to
          create it.  If the mean IBD file does not exist or is older than
          the merged IBD file, 'mibd means' will be run.  Note that when
          'mibd means' is run automatically, the default option, -all,
          will be used.  The 'mibd means' command must be issued directly
          in order to use the -typed option.  If any of the marker IBD
          files is newer than the merged IBD file, a warning message will
          be displayed.  In order to update the merged IBD file, the
          'mibd merge' command must be issued directly - this will not be
          done automatically.

          The 'mibd export' command outputs the multipoint IBDs for a
          specified set of chromosomes into a comma delimited file.
          The MIBDs must be stored in the directory named in the 'mibddir'
          command.  If no chromosomes are specified, then all multipoint
          IBDs found in the 'mibddir' directory are exported.  By default,
          the SOLAR indexed IDs (IBDIDs) in the MIBD files are translated
          to permanent IDs, and family IDs are included when present in
          the pedigree file.  The default name for the output file is
          "solar-mibd-export.out".  The default fields in the output file
          are CHROMO, LOCATION, [FAMID,] ID1, ID2, IBD, and D7, where
          LOCATION is the chromosomal location in cM.

          WARNING: The file to which MIBDs are exported can become very
          large. To keep export files to a manageable size, it may be best
          to export MIBDs on a per-chromosome basis, i.e. one export file
          per chromosome, or on a per-location basis by using the -byloc
          option.

          The options for the 'mibd export' command are

              -file (or -f)         Export MIBDs to this filename.

              -overwrite (or -ov)   Overwrite existing output file.

              -append (or -a)       Append MIBDs to existing output file.

              -nod7                 Don't include D7 field from MIBD files.

              -ibdid                Write out SOLAR indexed IDs (IBDIDs)
                                      rather than permanent IDs.

              -byloc                Export MIBDs on a per-location basis,
                                    i.e. one export file per location. The
                                    export files are given unique names by
                                    appending the chromosome number and the
                                    location to the filename given by the
                                    -file option.

          The 'mibd import' command inputs the multipoint IBDs for a
          specified set of chromosomes from a comma delimited file.
          MIBD files are written and stored in the directory named in the
          'mibddir' command.  If an MIBD file for an imported chromosomal
          location already exists, it is overwritten.  By default, the
          permanent IDs in the input file are translated to SOLAR indexed
          IDs (IBDIDs).  Family IDs must be included in the input file
          when they are present in the pedigree file.  The default name
          for the input file is "solar-mibd-import.in".  The default
          fields in the input file are CHROMO, LOCATION, [FAMID,] ID1,
          ID2, and IBD.  If the input file does not contain a D7 field,
          all D7 values in the MIBD files are set to zero.  By default,
          all MIBDs in the input file are imported.  If chromosomes are
          specified on the command line, however, MIBDs are imported for
          those chromosomes only.

          NOTE: The order of the chromosomes and chromosomal locations
          in the input file is unimportant, but all the lines for a given
          chromosomal location MUST BE ADJACENT.  To be safe, you may want
          to sort the input file by chromosome and chromosomal location
          to ensure that the input file is ordered correctly.

          The options for the 'mibd import' command are

              -file (or -f)         Import MIBDs from this filename.

              -nod7                 Don't take D7 from input file; set D7
                                      to zero instead.

              -ibdid                Input file contains SOLAR indexed IDs
                                      (IBDIDs) rather than permanent IDs.


          The 'mibd prep' command generates the input files needed to
          compute multipoint IBDs using a program other than SOLAR.
          The programs currently supported are Loki, SimWalk2, Merlin and
          GeneHunter.  Before this command can be run, marker data and a
          map file must have been loaded.  The input files are generated
          from various files created by SOLAR when pedigree and marker
          data are loaded and so contain SOLAR indexed IDs (IBDIDs).

          The marker locations written to an input file will be in Haldane
          cM.  If the user has loaded a Kosambi map file, the necessary
          conversion to Haldane is made automatically.  By default, IBDs
          will be calculated at every integer cM location from 0 to the
          last marker in the map file.  The '-qter' option extends the range
          of locations to the end of the chromosome.  For each chromosome,
          SOLAR defines qter as the nearest integer location greater than
          or equal to the position (in Haldane cM) of the last marker on
          that chromosome in the deCODE map.

          The allele frequencies in effect, whether read from a file or
          computed by SOLAR, are passed to the multipoint IBD calculation
          program, except in the case of Loki, for which the default action
          is to let Loki estimate the allele frequencies.  The '-usefreq'
          option can be used to force Loki to use the current allele
          frequencies.

          NOTE: After the input files have been created, the user must exit
          SOLAR and run the external program to compute the multipoint IBDs.
          Once the IBD calculations are complete, the resulting output file
          can be imported into SOLAR using the 'mibd import <program>'
          command.

          The 'mibd import <program>' command reads an output file which
          contains the multipoint IBDs computed by a program other than
          SOLAR, and imports those IBDs to create SOLAR-format MIBD files.
          The programs currently supported are Loki, SimWalk2, Merlin
          and GeneHunter.  This command is designed to work with the
          'mibd prep' command, so the output file is assumed to contain
          SOLAR indexed IDs (IBDIDs), not the real IDs from the pedigree
          data file.  Before this command can be run, the 'mibddir'
          command must have been given to specify the directory in which
          the MIBD files are to be stored.

          NOTE: For both the 'mibd prep' and 'mibd import' commands, if
          SimWalk2 is the program chosen, then it is assumed that version
          2.91 or a newer version of SimWalk2 will be used to compute the
          multipoint IBDs. In previous versions of SOLAR, SimWalk2 version
          2.82 was assumed. Due to a backward incompatibility in file
          formats that was introduced in later SimWalk2 versions, if you
          wish to use the earlier version of SimWalk2, it is now necessary
          to include the '-version 2.82' option.

Notes:    The computed multipoint IBDs are stored in gzipped files with
          names of the form 'mibd.<chromo>.<loc>.gz', where <chrom> is the
          chromosome number and <loc> is the chromosomal location.

          Several additional files are created and used for multipoint
          IBD calculation:

             mibdrel.ped       relative-class and kinship information
             mibdchrN.loc      marker locations on chromosome N
             mibdchrN.mrg.gz   merged marker-specific IBDs for chromosome N
             mibdchrN.mean     mean IBD by relative class for chromosome N


Shortcuts: mibd - mibd

Return to command index

A1.120 mibddir


Purpose:  Set directory in which MIBD matrix files are stored
           (multipoint only; use ibddir to set up twopoint)

Usage:    mibddir <dirname>     ; set directory for MIBD files
          mibddir               ; show current mibddir
          mibddir -session      ; show mibddir entered in this session

Notes:    The mibddir selected is saved in file mibddir.info for
          future SOLAR sessions.  Once a midddir is selected, it
          need not be selected again within the same working directory,
          EXCEPT for the purposes of writing out mibd files.  To
          prevent accidentally overwriting pre-existing mibd files,
          it is necessary to explicitly enter the mibddir
          command before using the mibd command or other commands
          which write files into the mibddir.

Shortcuts: mibdd - mibddir

Return to command index

A1.121 miniplot


Purpose:  Arrange miniature plots on a single page

Usage:    miniplot [-pass <pass>] [-allpass] [-plots <number>] 
                   [-port] [-land]

            See also "plot -all"

          -pass       Do this pass number (default is 1)
          -allpass    Do all passes, each on a separate page
          -plots      Put this many plots on a page
          -port       Portrait layout
          -land       Landscape layout
          -nodisplay  Generate postscript, but don't display

         Output file named passN.out (pass01.out for pass 1) in trait/outdir
         is created.  The trait or outdir must have been specified previously
         and the plots must have been created previously (see usage for
         example).

         The individual chromosome plots should have been created previously
         using the "plot" command.  In fact, "plot -all" or "plot -allpass"
         will invoke miniplot automatically.

         This requires that Python (1.5.2 and later works, maybe earlier)
         be installed.  If you do not have python, use "plot -string"
         instead.

Shortcuts: minipl - miniplot

Return to command index

A1.122 model


Purpose:  Describe, save, or load a model

Usage:    save model <modelname>     ; save current model to a file
          load model <modelname>     ; load model from a file
          model                      ; display model on terminal
          model new                  ; reset to new empty model

Notes:    An extension .mod is automatically appended if not specified.
          You must specify directory path if you want to save model
          in a subdirectory of the current directory.
- 

Shortcuts: model - model

Return to command index

A1.123 mu


Purpose:  Set or Display the Mu equation (trait value estimator)

          Usually the covariate command is used to set this automatically,
          but the mu command provides more advanced capabilities, such as
          using log or sine functions.

Usage:    mu                        ; displays Mu equation
          mu = mu + <expression>    ; add new terms to Mu equation
          mu = <expression>         ; replaces Mu equation (SEE DISCUSSION!)
          mu reset                  ; restores default Mu equation

          <expression> may include mathematical operators (+-*/^), 
          constants, parentheses, any mathematical functions defined in
          the C programming language, any phenotypic variables included
          in the analysis, sex, and for any variable "var" x_var (the
          sample mean), min_var (the sample minimum), and max_var
          (the sample maximum).  Parameters whose names include
          any erstwhile operators including parentheses, *, or ^ must
          be enclosed in angle brackets <> to prevent being parsed as
          functions; note this always happens for bivariate models, or when
          there are interaction covariates such as age*sex, or squared
          covariates such as age^2.  For bivariate models, you can also
          include "t1" and "t2": t1 is 1 if the mu is being evaluated
          for the first trait, and 0 otherwise, and t2 has this behavior
          for the second trait.  All variables included in the mu will be
          required in the sample.

          Also it is possible for the mu to include inequality operators
          (such as >=) and the "print" function (for debugging purposes).
          In these regards, the mu expression is like the omega expression.
          See "help omega" for more about inequalities and print, and
          a complete listing of the mathematical functions available.
          
Discussion:

 The "mu" embodies the estimation of the trait value for any individual
 based on the sample mean and their covariate values.  It does not
 normally include genetic effects.  The difference from this estimation
 and the actual value is used to determine genetic and other
 intra-individual effects.  Thus, "mu" is evaluated in the context of
 of a single individual, and NOT a pair of individuals as with "omega".

 You can get examples of many possible "mu" commands by using the
 mu command to display the current mu equation for different
 models.  For example:

 solar> model new
 solar> trait q1
 solar> covar age
 solar> mu
 mu = \{Mean+bage*(age-x_age)\}

 First notice that the entire body of this default mu equation is
 delimited by \{ and \} characters.  This is the portion which is
 automatically generated by SOLAR and will be changed automatically
 if your covariates are changed.  You should not normally edit this
 portion of the mu.  If you need to change the mu, you can either
 augment this portion with an additional expression, or replace the
 mu altogether with a new expression, in which case you must leave
 out the \{ and \} delimiters.  If you replace the mu altogether with
 a new expression, you are then responsible for including terms for
 covariates (if any) and it is not necessary to use the "covariate"
 command at all.

 The Mean and bage terms refer to parameters in the model, the age term
 refers to a data variable "age" found in the phenotypes file, and the
 term x_age refers to the average age of all individuals in this sample.
 You may include similar terms in any new mu expression.

 Adding To The mu

 You can add additional terms either by appending them onto the mu shown
 by the mu command (using terminal cut and paste makes this convenient)
 or using the "mu = mu + ..." shorthand.  For example, using the
 shorthand, you could add a new term for log(weight-100) as follows:

 solar> mu = mu + log(weight-100)

 OR by entering the following:

 solar> mu = \{Mean+bage*(age-x_age)\} + log(weight-100)

 in either case, the result would be the same:

 solar> mu
 mu = \{Mean+bage*(age-x_age)\} + log(weight-100)

 If you then added another covariate, that would be included automatically
 in the default portion of the mu:

 solar> covar sex
 solar> mu
 mu = \{Mean+bage*(age-x_age)+bsex*Female\} + log(weight-100)

 Notice here that the variable "Female" changes according to the sex.
 It is "0" for male and "1" for female.

 Replacing the Mu

 You can also replace the Mu altogether, removing the "default portion."
 If you remove the specially delimited "default portion" from the mu,
 your covariate commands will have no effect on the mu, and you will
 either have to write the beta parameters into the mu yourself or
 remove the covariates altogether.  All phenotypic variables you
 write into the model will be required for all individuals to be
 included in the sample.

 Continuing our example:

 solar> covariate delete_all
 solar> mu
 mu = \{Mean]}
 solar> mu = Mean + log(weight-min_weight)
 solar> mu
 mu =  Mean + log(weight-min_weight)

 The Mu can be as elaborate as you like, including any mathematical
 functions defined in the "C" programming language.  It need not include
 the "Mean" parameter (in fact you do not even need a Mean parameter in
 SOLAR anymore).

 If you removed the default mu by mistake and need to restore it,
 use the "mu reset" command.

 Bivariate Mu

 solar> model new
 solar> trait q1 q2
 solar> covar sex
 solar> mu
mu = \{t1*(<Mean(q1)>+<bsex(q1)>*Female) + t2*(<Mean(q2)>+<bsex(q2)>*Female)\}

 Notice that the mu for this bivariate model has separte terms for the first
 and second traits, which are identified by "t1" and "t2".  (The variable
 "t1" is true if the first trait is being estimated, and false if the
 second trait is being estimated.  If you replace the mu, any terms not
 multiplied by "t1" or "t2" will be applied to the estimation of both
 traits, and you may have as many (or as few) t1 and/or t2 terms as you
 need.

Additional Notes:

 (1) Use the "mu = mu + <expression>" as described above instead of
     the now obsolescent "mu = <expression> + mu" to add to the mu.
     Also, you may notice that square brackets are no longer used
     to delimit the default mu.  They did not work as had been intended.
     The default portion of the mu is now delimited by \{ and \} which
     may be included in a user specified mu.  Everything within the
     delimiters is maintained by SOLAR and editing this portion will
     have no effect.  It is simply displayed for informational purposes.
     If the mu is defaulted, models will be saved with a mu "comment"
     for informational purposes only; the actual mu is determined by
     the covariates.

 (2) As terms in the mu equation, you may use any constant, any
     parameter, Sex, Mean, and any Phenotypic variable.  There are
     also predefined terms for any phenotype named 'var': x_var
     (the sample mean), min_var (the sample min), and max_var (the
     sample max).  Any math operator (+,-,*,/) and function defined 
     in the C programming language may be used.  Also, the ^ character
     may be used to indicate exponentiation.

 (3) Parameter names which include * or ^ should be enclosed in
     <> angle brackets to prevent the names from being interpreted
     as multiplication and/or exponentiation expressions:

          mu = Mean + <bage*sex>*(age-x_age)*Female

 (4) The default mu expression will display all variables as being
     adjusted to their mean (e.g. age-x_age).  However, during
     maximization, if a variable is found to be binary, the
     variable is adjusted to its minimum (e.g. diabet-min_diabet)
     instead.  This will be reflected after the first maximization.
     User-created mu equations must always correctly specify
     either the mean (x_) or min (min_) variable as required.


Shortcuts: mu - mu

Return to command index

A1.124 multipoint


Purpose:  Perform a multipoint analysis.
            Scan loci on selected chromosomes at selected interval
            (use chromosome, interval, and finemap commands beforehand)

Usage:   multipoint [<LOD1> [<LOD2> [<LOD3> ...]]] [-overwrite] [-restart]
                    [-renew mod] [-nullbase] [-plot] [-score]
                    [-cparm <plist>] [-rhoq <fixed value>] [-saveall]
                    [-ctparm <plist>] [-se]

         Zero or more criterion LOD scores may be specified.  If none are
         specified, multipoint will make one full scan and then stop.  If
         one LOD score is specified, multipoint will continue scanning
         until the highest LOD found in the last scan is no longer
         greater than or equal to the LOD score specified.  If more than
         one LOD score is specified, each LOD will apply after one scan
         has been completed.  Then the last LOD specified will remain in
         effect.

         -overwrite  (or -ov) Overwrite existing multipoint output files.

         -restart    (or -r) Restart previous multipoint run

         -nose       Don't bother computing standard errors in best
                     models (S.E.'s are not normally computing while
                     scanning anyway).  This should not be combined with
                     -slod option.

         -plot       plot each point while scanning (uses plot -quick)
                       Shows the current chromosome in the current pass,
                       ending with the last chromosome in the last pass.
                       To view previous passes, or for best quality plot,
                       use the plot command.  The plot command may be run
                       simultaneously in other SOLAR sessions plotting the
                       same data from the multipoint*.out files.  For more
                       information, see help for the plot command.

         -score        Use Score based LOD (S-LOD) defined as:
                       SLOD (score(i)^2 * SE(i))/(2 ln (10)) (where i is 
                       the index of the new parameter).

         -cparm <plist>  Custom parameters.  (See also -ctparm.)  This is
                         discussed in Section 9.5 of the manual.  Scanning
                         will consist of replacing one matrix with another
                         matrix, everything else is unchanged.  The
                         starting model MUST be a "prototype" linkage model
                         will all the desired parameters, omega, and
                         constraints.  Starting points and boundaries for
                         all variance parameters must be explicitly
                         specified.  Following the -cparm tag, there must be
                         a list of parameters in curly braces that you want
                         printed out for each model.  The list can be empty
                         as indicated by an empty pair of curly braces {}.
                         The matrix to be replaced must have name mibd or
                         mibd1, mibd2, etc.  The highest such mibd will be
                         replaced.  If the loaded matrix has two columns,
                         each succeeding matrix will also be loaded with two
                         columns.  There must be a model named null0 in
                         the maximization output directory for LOD
                         computation.  See section 9.5 for an example of
                         custom parameterization.  Note: the user's
                         starting model is saved in the output directory
                         as multipoint.template.mod.  Any or all parameters
                         in the <plist> may also be multiple-term
                         expressions.  See second example below.

                         After revision in version 4.1.5, -cparm now
                         reloads the prototype model at the beginning of
                         each chromosome or during finemapping if there is
                         a gap greater than 11cm.  This provides much more
                         stable operation of -cparm and fixes the problems
                         that led most people to use -ctparm.  However,
                         -ctparm may be preferable in some cases where
                         there are convergence errors.  Or vice versa.
                         Another strategy is to set the interval to 1.
                       
         -ctparm <plist> Custom parameters, as -cparm, but rebuilding each
                         model from the "prototype" linkage model.  This
                         might be slower, but it has the advantage of
                         greater reliability.  If any model ends up with
                         parameter(s) on boundaries, it has no ill effect
                         on the remaining models.

         -se           Calculate standard errors in all linkage models.
                       Otherwise, they are always NOT calculated.  This
                       is mainly useful in conjunction with -cparm
                       and -ctparm.  See second example below.
                       

         -link <proc>  Use specified (by name) procedure instead of
                       linkmod (default) to move to the next locus.
                       The procedure requires 1 argument which is the
                       full relative or absolute pathname to the mibd file.
                       For now, it should ignore additional arguments
                       (use a trailing "args" argument to do this).

         -nullbase     Reload null model as base for each linkage model.  
                       The default is to start from the previous linkage
                       model if on the same chromosome.

         -epistasis N   Use current loaded model as the base for a one-pass
                        epistasis scan.  N is the index of the mibdN to
                        be included in epistatic interactions (e.g. 1 for
                        mibd1).  An additional parameter H2qE1 will be
                        added for the interaction term (for mibdN and
                        mibd<scan>).  Output files will be named
                        multipointe.out and multipointe1.out.  Only one
                        epistasis pass is currently supported; if
                        oligogenic scanning is desired that should be
                        done first before running an epistasis scan.  At the
                        one QTL where mibdN and mibd<scan> are the same,
                        h2q<scan> is constrained to zero (it is not and
                        should not be constrained to zero elsewhere).

         -rhoq <value>  Constrain rhoq parameters to <value>.

         -saveall       Save the multipoint linkage models for every locus
                        tested, not just the best ones.  The filenames look
                        like this: multi.pass1.2.3.mod for pass 1, chromosome
                        2, locus 3.  The maximization output files are saved
                        also following the same naming convention but with
                        a .out suffix.  Warning!  This can fill up a lot of
                        harddrive space quickly.  It is recommended to
                        restrict this to a chromosome and/or range (set with
                        the interval command) of interest.
                        
Examples: multipoint 3 1.9

         This will first do a full scan and zero-in scan once, then, if the
         highest LOD >= 3, it will scan again.  If the next highest 
         LOD >= 1.9, it will continue scanning until the last highest
         LOD < 1.9.

         trait q4
         polymod
         maximize
         save model q4/null0
         linkmod gaw10mibd/mibd.9.1.gz
         option standerr 1
         multipoint -ctparm {h2r h2q1 {par h2q1 se}} -se

         This illustrates a simple use of the "custom parameterization"
         option.  Note that unlike the typical use of the multipoint
         command, it is necessary to create a "prototype" linkage model
         first (here it is done with the linkmod command, but one might
         also use linkqsd or build the model "by hand" setting up the
         parameters and omega).  The list of parameters following -ctparm
         may also include commands enclosed in a second level of braces.
         The command must include more than one element as it is not
         the braces but the element length that determines whether the
         element is interpreted as a parameter or a command.
         In this example, a command extracts the standard error of h2q1.
         
Requires:    mibddir, chromosome, and interval commands must have been
             given to select mibd files to use.  finemap may be adjusted
             with finemap command.

             There must be a null0.mod model in the trait or outdir
             directory.  This can be created with the polygenic command
             prior to running multipoint.  (This model may include
             household and covariate effects.  See the help for the
             polygenic command for more information.)

IMPORTANT NOTE:  In most cases, multipoint starts by loading a model named
                 null0.mod from the current output directory.  The
                 model currently in memory is ignored.  This is done
                 because it is absolutely essentially that the null0
                 model be the basis to build all multipoint models.  However,
                 some options, such as -ctparm, use the model currently
                 in memory when multipoint is invoked because all
                 models are derived from a custom linkage model that
                 the multipoint command does not necessarily know how
                 to build.

Notes:   1.  Summary output is written to multipoint.out in a subdirectory
             named after the trait variable.  You can set another output
             directory with the outdir command.  Contents of the output
             directory will be purged of previous files at the beginning
             of each invocation if -overwrite is used.

         2.  The final "best" linkage model is link.mod.  In addition,
             a series of additional "null" models is produced, starting
             with null1.mod (containing 1 QTL), null2.mod, etc.  These
             models are produced only if a LOD criterion is specified
             and satisfied (so there is more than one pass).

         3.  If a LOD adjustment is in effect (see lodadj command) it
             is applied here.

         4.  If models have two traits, the 2df LOD scores will be
             converted to 1df effective LOD scores.  To override this, 
             use the lodp command (see).  This feature
             was first included with SOLAR beta version 2.0.1.

         5.  At the beginning of each pass through the selected genome,
             multipoint calls a user script named multipoint_user_start_pass
             which takes one argument, the pass number (which starts at 1
             for the first pass).  Within this routine, the user can change
             the selected chromosomes or interval.

Shortcuts: mul - multipoint

Return to command index

A1.125 needk2


Purpose:  Keep K2 (phi2) terms from MIBD matrices

Usage:    needk2
          needk2 off

Notes:    This command is now obsolescent and should not be used.

          The K2 in MIBD files is obsolescent.  We now maintain
          a separate phi2.gz file for discrete trait analyses, and
          for quantitative trait analyses, the K2 (phi2) values are
          computed as needed.

Old Notes:

          If you need to use any of the K2_* matrix values, issue the needk2
          command before loading the matrix (or running 'multipoint.')

          Normally the K2 values from matrix files are not used because
          they are identical to the K2 values computed by SOLAR as needed.

          The default (of not saving K2) cuts matrix memory usage in half.

Shortcuts: needk - needk2

Return to command index

A1.126 newmod


Purpose:  Start a new model

Usage:    newmod [<trait>]+

          <trait>  Set the trait(s) to this/these trait(s).  (The trait(s)
                   can be specified later.  If not specified here, they
                   become <undefined>.)
Notes:

   (1)   This combines "model new", "outdir -default", and optionally
         trait [<trait>]+ .  This is now preferred to using the separate
         commands, because it is shorter.  For example, the command:

       newmod q1 q2

          takes the place of the commands:

       outdir -default
       model new
       trait q1 q2

          Clearly the "newmod" form is superior, it preserves the
          essential information while reducing redundant keystrokes.

   (2)    Since this clears the outdir, it is adviseable to use this
          command instead of "model new" to be sure that the outdir
          is cleared, and not inheirited from some previous script.
          From now on, the manual advises using "newmod" (and not
          "model new") for this reason.  However, the behavior of
          "model new" itself is unchanged, so that existing scripts
          that operate correctly will continue to operate correctly.
          When combining previously written scripts that use "model new"
          instead of "newmod", the user must be careful to update
          "outdir" status if required.  New scripts using "newmod" will
          not be subject to the error of incorrectly inheiriting an
          unwanted outdir setting.

Shortcuts: newm - newmodels

Return to command index

A1.127 newtcl


Purpose:  Recognize new or changed Tcl procedures in Tcl scripts

Usage:    newtcl

Notes:    At the time a SOLAR session is started, all Tcl scripts
          (files ending with ".tcl") are scanned.  The newtcl
          command forces another such scan in order to recognize
          new Tcl procedures (created AFTER the start of the SOLAR
          session), or to recognize changes to Tcl procedures since
          the first time those procedures were used (see explanation
          below).  You could also accomplish this by exiting from
          and restarting SOLAR, but that is often inconvenient
          because it causes the loss of session state.

          The following directories are scanned by SOLAR for user scripts:

               .      (the current working directory)
               ~/lib  (the lib subdirectory of your home directory, if it exists)

          A procedure found in "." will supercede one found in "~/lib" having
          the same name.  Also beware that if the same procedure name is used
          in more than one script file, the first one encountered will be
          the one actually used.  If the same procedure name is found in two
          files in the same directory, the precedence is not predictable.

          The scanning process simply looks through each script file for
          "proc" (procedure) statements.  An index of all the procedures
          is then written to a file named tclIndex in the working directory.
          This file will only be created if user-defined Tcl scripts are found.

          Tcl procedures are only loaded into SOLAR the first time they
          used.  Once loaded, they stay loaded, and may no longer reflect
          the Tcl files in the scan path if those Tcl files are changed.
          The newtcl command flushes all currently loaded procedures, so
          the next time any procedure is invoked, it will be reloaded from
          the file.

          The main Tcl file used by SOLAR is named solar.tcl and is located in
          the lib subdirectory of the SOLAR installation.  This defines all
          the fundamental procedures used by SOLAR.  User-defined procedures
          having the same name as built-in procedures will supercede them.

Shortcuts: newt - newtcl

Return to command index

A1.128 nifti_to_csv


Purpose: Converts nifti files to solar csv files.

Usage: nifti_to_csv <Mask Volume Name> <Trait Column Name> 
<Phenotype Filename> <Output Filename or Base Filename> [<max voxels per file>]

 Use a binary mask  (Mask Volume) to specify the region from where the voxel wise traits
 will be extracted (mask>0)

Trait Column refers to the column name where the nifti file names are specified. 
Phenotype file is a standard solar format file that includes IDs, covariates and a column
of nifti file names.  
Example
ID, Age, Sex, FA
N001, 24, M, N001_FA.nii.gz 
All other columns in the file will be treated as covariates and included in the output files. 

 N columns refers to the 
 number of traits per out file. It is suggested to keep this number < 10,000

See video instructions on how to use this function at www.solar-eclipse-genetics.org

Return to command index

A1.129 normal


Purpose:  Normal distribution functions

Usage:    normal -i[nverse] <p>

Notes:    Currently, the only supported function is the "inverse normal
          cumulative density function", which maps the open range
          0,1 to the whole real line.  (The values for 0 and 1 are
          out of range because they would be negative and positive
          infinity.)

          This normal function is used by the inormal procedure to
          perform an inverse normal transformation on a dataset.
          For further information, see the help for "inormal".
          In turn, the inormal procedure is part of the mechanism
          behind the "inormal_" prefix which may be applied to
          phenotypes in the define command.

          We will add additional normal distribution functions here as
          we need them.

          Our implementation is indirectly based on:

            Cody, W.D. (1993). "ALGORITHM 715: SPECFUN - A Portabel FORTRAN
            Package of Special Function Routines and Test Drivers"
            ACM Transactions on Mathematical Software. 19, 22-32.

Return to command index

A1.130 null


Purpose:  Return an optimized parameter from null.mod

Usage:   null loglike
         null h2q

Notes:   see also nulln

Shortcuts: null - null

Return to command index

A1.131 nulln


Purpose:  Return an optimized parameter from nullX.mod
           (e.g. null0.mod,  null1.mod, etc.).

Usage:   nulln 0 loglike
         nulln 1 loglike
         nulln 2 h2q1

Shortcuts: nulln - nulln

Return to command index

A1.132 old_zscore


solar::zs

Purpose:  The old zscore command to zscore current trait
         
Old Usage:    zscore [-off] [-q]
              zs     [-off]       ;# Perform zscore quietly

              -off                Turn off zscore
              -q                  Perform zscore quietly

Notes:    The "Mean" and "SD" values used by zscore are computed only
          once, at the time the zscore command is given.  Thus they do
          not reflect later changes to the phenotypes file, or to the
          sample, which might be restricted due to individuals missing
          covariates added later.  Generally, for this reason the
          zscore command should be given after the covariates command
          and immediately before a model maximizing command such as
          polygenic.

          Starting with SOLAR Version 4.0.9, the trait mean and SD
          are computed from the actual sample that would be included
          in an analysis (at the time the zscore command is given).

          As described in the notes below, you can adjust the Mean
          and SD by using "option zmean1" and "option zsd1" to set
          the values actually used.  These values are applied to
          the trait values during maximization.

          If the trait is changed without giving the "model new"
          command, the new trait will be zscored automatically.
          This feature is obsolescent.  In a future update, zscore
          will be turned off when the trait is changed.

          An alternative to zscore is to define the trait as the
          inverse normal transformation of a variable.  See
          "help inormal" and "help define" for further details.
          
          zscore will also calculate a number of statistics
          for the trait: mean, minimum, maximum, standard deviation,
          skewness, and kurtosis.  These will be written to the file
          zscore.out in the current output directory.  As of version
          4.0.9, these statistics are no longer written to the terminal.
          Instead, a single line is displayed with the trait name,
          mean, and SD.  Even that line is not shown if zscore is
          invoked from a script or the zs abbreviation of the command
          is used.

          To calculate these statistics for any phenotypic variable without
          zscoring and without necessarily making it the trait, use the
          "stats" command instead.

          A trait must already have been selected with the trait command
          or loaded model.  Also the phenotypes file must have been loaded.

          When a maximization is performed, trait values are replaced with
          their zscored values.  The formula is:

          zscored = (value - Mean) / SD

          zscore is a model dependent option controlled by "option zscore".
          It remains in effect until another model is loaded or the
          "model new" command is given.  When models maximized with zscore
          are reloaded, zscore is again activated.

          "option zscore" is set to 1 ("on") by this command, and the
          related options zmean1 and zsd1 (mean and standard deviation
          for the first trait) and zmean2 and zsd2 (mean and standard
          deviation for the second trait) are set as required.  You can
          adjust these options directly to fine tune the mean and standard
          deviation values used, but be sure that zscore is not set to 1
          until the mean and (non-zero !) standard deviation values are
          set for all traits in the model.

          In a multivariate model, zscore will only be applied to the
          first two traits.

          Whenever zscore is activated or deactivated, parameters mean
          and SD are reset to zero to force setting new boundaries and
          starting point during the next maximization.

          If a new phenotypes file is loaded, the zscore command should be
          repeated to reflect the new file.

Return to command index

A1.133 omega


Purpose:  Sets the Omega (Covariance) equation directly

          Important: By default, SOLAR (starting with version 2.0.2)
          sets up the omega automatically when you give the "trait"
          command, and subsequently modifies it as required during
          polygenic, multipoint, and other commands.  It is only
          necessary for the user to use the omega command for
          special purpose advanced analyses.

Usage:    omega = <expression>    ; sets Omega equation
          omega                   ; displays Omega equation
          omega reset             ; reset default Omega equation

Notes:    The default omega for a SOLAR model with one linkage element is:

          omega = pvar*(I*e2 + phi2*h2r + mibd1*h2q1)

Notice that each term inside the parentheses has a matrix and a parameter.
Some of the matrices loaded from external files and some are computed
internally.

The built-in variables are:

          pvar...........Phenotypic variance.  This is the square of the SD
                         (trait standard deviation) parameter.

          I.............Identity matrix, which equals 1 when pair of
                        individuals is the same individual, 0 otherwise.

          phi2..........Two times the kinship coefficient, for quantiatitive
                         models this is normally computed internally and 
                         on-the-fly to reduce storage requirements.  
                         Also found (identically) in the phi2.gz matrix file
                         created by "load pedigree," SOLAR uses the 
                         phi2.gz by default for discrete traits, or the usage
                         of the external file can be forced by using the
                         "loadkin" command or giving a suitable "load matrix"
                         command such as "load matrix phi2.gz phi2 delta7".

          delta7........Dominance coefficient, equivalent to Jacquard's
                        delta7 from the series delta1-delta7 when there
                        is no inbreeding.  If there is inbreeding, this
                        should not be used.  As with phi2, this is, by
                        default, computed internally and on-the-fly for
                        quantitative models unless "loadkin" or a 
                        comparable "load matrix" command is given.  The
                        delta7 matrix is the 2nd matrix within the phi2.gz
                        file.  This matrix should only be loaded if it is
                        needed because it is usually not used by SOLAR.

          male_i........1 if individual "i" is male, 0 otherwise.
          male_j........1 if individual "j" is male, 0 otherwise.
          female_i......1 if individual "i" is female, 0 otherwise.
          female_j......1 if individual "j" is female, 0 otherwise.
          si............index of trait i (1..ntraits)
          sj............index of trait j (1..ntraits)

          For <phenotype> which is the name of a phenotypic variable:

          x_<phenotype>.....sample mean for <phenotype>.  For example, x_age.
          min_<phenotype>...minimum sample value of <phenotype>.
          max_<phenotype>...maximum sample value of <phenotype>.
          <phenotype>_i.....value of <phenotype> for individual "i".
          <phenotype>_j.....value of <phenotype> for individual "j".

          For <parameter> which is the base name of a parameter:

          <parameter>(ti)...replace "ti" by name of trait of individual "i".
                            For example, h2r(ti) may be h2r(weight) in
                            an analysis of traits height and weight.
          <parameter>(tj)...replace "tj" by name of trait of individual "j".

          teq...............trait for individuals i and j is the same.
          tne...............trait for individuals i and j are not the same.

          Matrices may also be used, along with math operators
          + - * / and also ^ (power) and () (parentheses), and also
          all math functions defined by the C Programming Language
          which includes "log" for natural logarithm, trig functions,
          and hyperbolic functions, among others.  Here is a list:
          erfc, erf, lgamma, gamma, j1, j0, y1, y0, rint, floor, ceil, 
          tanh, cosh, sinh, atan, acos, asin, tan, cos, sin, expm1, exp,
          logb, log1p, log10, log, cbrt, sqrt, and abs.

          Parameter names with special characters should be enquoted using
          angle brackets so the entire name is enquoted, including any
          prefix string.  For example, given a variable named age.onset with
          dot, the mean value of the variable could be indicated with
          <x_age.onset> in angle brackets as shown.  This is the same as
          the rule used by the define and constraint commands.

          Beginning with version 3.0.4, the following equalities and
          inequalities may also be used between two terms.  If the
          operator is true, 1 is returned, otherwise 0 is returned.
          This enables you to construct compound conditional expressions
          having the same effect as could have been done with "if"
          statements.  The C operators < and > have been replaced with
          << and >> so as not to be confused with the <> quotation of
          variable names in SOLAR.

          C Format    Fortran Format    Test
          --------    --------------    ----

          ==          .eq.              if equal
          !=          .ne.              if not equal
          >=          .ge.              if greather than or equal
          <=          .le.              if less than or equal
          >>          .gt.              if greater than
          <<          .lt.              if less than

          Example of use of inequalities:

              omega = pvar * (I*e2 + (h2r >= 0.125)*phi2*h2rc + \
                (h2r < 0.125)*phi2*h2rd)

          Beware that comparing the equality or inequality of two floating
          point numbers sometimes does not work as expected due to
          numerical representation limitations.  For example, 1/3 might
          not equal 2/6.

          The precedence of the equality and inequality operators is below
          that of all other operations, so their expressions should be
          in parentheses as in the example above.

          There is also a function named "print" which simply prints
          the value of its argument, which may be any expression, and
          then returns that value.  For example, in place of the standard
          univariate omega

              omega = pvar*(phi2*h2r + I*e2)

          You could have:

              omega = pvar*(print(phi2) + I*e2)

          and this would print each phi2 value as it is used.  An
          expression may include any number of print functions, and
          they are evaluated in the standard order of evaluation,
          starting with the innermost subexpression.  If you simply
          want to print some value without including it in the rest of
          the expression, you can multiply the print function by zero,
          for example:

        omega = pvar*(phi2*h2r + I*e2 + 0*print(delta7))

          At this time, the print function can only print one number,
          without any identifying string.  After each value is printed,
          you must press RETURN to procede to the next, or you can hold
          down RETURN to pass through a lot of prints.

For multivariate models which have 3 or more traits, the automatically
created standard omega includes generic rho parameters rhoe_ij, rhog_ij,
and rhoc_ij if household effects, and rhoq1_ij (to rhoq10_ij) for linkage
models.  When the omega is evaluated, the i and j are replaced with the
trait indexes.  For example, with traits 1 and 2 rhoe_ij becomes rhoe_12.
It is possible to write omegas without these generic rhos if desired.

There are also 4 additional generic rho's available for custom usage:
rhoa_ij, rhob_ij, rhod_ij, and rhof_ij.


Shortcuts: ome - omega

Return to command index

A1.134 option


Purpose:  Set or read the value of model-specific options.

Usage:    option <option name> <value>    ; sets option value
          option <option name>            ; reads option value
          option                          ; shows all option values

Notes:    ibd-specific options are set by ibdoption.

          Most options control fairly obscure aspects of SOLAR operation 
          and are not normally changed directly by SOLAR users.  Many are 
          automatically controlled by other SOLAR commands during
          normal operation.

          Model-specific options are stored in saved model files.
          Starting a new model (with the "model new" command) will
          reset all options to the default value.  Loading a model
          will reset all options to the default, then load the options
          in the model file.  It is recommended to specify options
          after specifying the trait and covariates, but before
          giving the "polygenic" or other model maximizing command.

          Warning: RARELY USED options may have become buggy in the 
          context of more current features since they haven't been recently
          tested.

          Here is a list of options and their default values:

   CMDiagonal 0      Covariance matrices diagonal (1=yes; 0=no; automatically
                       set to 1 for sporadic models, 0 for all others)

   StandErr 1        Compute Standard Errors (1=yes; 0=no; defaults to 1
                       except during a multipoint scan where it is set to
                       0 while scanning to improve speed)

   ModelType Default    Determines the type of maximization performed.

             Default    Default will peform standard quantitative
                        maximization if univariate quantitative and
                        discrete/mixed mode otherwise.

             Evd        Original Evd1 modeltype computes eigenvalues
                        for the kinship matrix only which requires
                        more ongoing processing, but for pedigrees
                        less than 1000 may be faster.  (Note: the
                        mga command by default uses Evd.)

             Evd2       EVD2 processing peforms complete transformation
                        of the trait and all covariates.  This requires
                        more upfront processing but may be faster for
                        pedigrees larger than 1000.

            AutoPhi2    Do not use this option unless you know you need
                        it.  It is not generally needed for new work.

                        Beginning with Version 8.3.1, the phi2.gz
                        matrix is loaded during maximization if
                        phi2 is not already loaded by the user.  That
                        was already the default for many kinds of
                        maximization, just not the simplest univariate
                        quantitative.  The AutoPhi2 modeltype brings back
                        the original on-the-fly calculation of phi2, if
                        no phi2.gz had been loaded.  That is often not
                        what is intended, but sometimes people who
                        had not saved their phi2.gz were relying on
                        it.
                        
   StandLogLike 0    Standardize Loglikelihood (0=no; 1=yes) RARELY USED.

   AutoCovarBound 1.25   Factor used in estimating covariate boundaries:
                           AutoCovarBound*(tmax-tmin)/(cmax-cmin)
                           This is a fairly wide but useable estimate
                           which has never needed adjustment.

   Grid 0            Method used in likelihood estimation 0=Search; 1=Grid
                       RARELY USED, and not to be confused with grid command
   GridPoints 1      Points to be used if Grid option (above) is applied

   MaxIter 1000      Maximum Iterations allowed in a loglikelihood search
                       (If you need more than this, something is probably
                       wrong.  Usually only 10-20 iterations of searching
                       is sufficient.  MaxIter is to prevent SOLAR from
                       iterating forever in troublesome cases.)
                       
   Outlier 0         0=keep; 1=remove outliers (It's probably better to
                       to remove them yourself than rely on this RARELY USED
                       option.)
   CutPeople 0.05    Factor used to remove outlying people
   CutPed 1.0        Factor used to remove outlying pedigrees

   TDist 0           Automatically set by the "tdist" command.  Don't set
                       this option yourself unless you are an expert.  Use
                       the tdist command instead, which sets this option
                       and sets up the required parameter for you.

   Conv 1.0E-6            Convergence improvement factor for quantitative
                            models (For experts only!  See Chapter 6.)
   Conv(Discrete) 1.0e-4  Convergence improvement factor for discrete models
   NConv 4                Convergence count (For experts only!) Conv has to
                            satisfied this many times.
   Tol 1.0E-8             Tolerance for parameters (for experts only!)
   MaxStep 5              Maximum steps (for experts only!) This many
                            decrements are allowed in the attempt to improve
                            loglikelihood.  This may need to be increased in
                            troublesome cases
   BCliffs 0.1            Backup from NaN cliffs by this factor (for experts
                            only!) currently used only for discrete models
   MaxCliffs 15           Maximum steps to backup from NaN cliffs (for
                            experts only) currently used only for discrete

   ScoreOnlyIndex -1   Automatically set by "multipoint -score"; otherwise 
                         don't touch

   MergeHousePeds 1    Merge pedigrees sharing households (1=yes; 0=no)
                         Necessary for accurate C2 estimation; 1 is default.
   MergeAllPeds 0      Merge ALL pedigrees  (Earlier merging method; use
                         only if MergeHousePeds fails inexplicably)

   RobustEst 0         Robust Estimation (new and barely tested)
   Tune 3              Factor used with robust estimation

   PedSelect 0         Select only this pedigree for maximization.  The
                       zero default means "all pedigrees."  Otherwise
                       use integer to select pedigree as indexed in
                       pedindex.out created by "load pedigree."
                       Alternatively use commands "pedlike" and
                       "pedlod" to get pedigree specific likelihoods
                       and lods using the same parameter values. It
                       is also possible to select multiple pedigrees
                       by using + operator: option pedselect 1;
                       option pedselect + 2; and so on.  + is optional
                       before the first selection.  The list of selected
                       pedigrees is saved to model file, and can be
                       cleared out either with "option pedselect 0" or
                       "model new."

   EnableDiscrete 1    Use SOLAR "Discrete Trait" modeling if trait is
                       found to be discrete (2 integer values separated
                       by one, e.g. 0,1).  (0) means use quantitative
                       modeling regardless of inferred trait type.

   DiscreteOrder 1     Ordering for discrete pedigrees.  You are strongly
                       discouraged from changing this by Dr. Jeff Williams
                       who has done considerable study of discrete trait
                       modeling.  The default ordering (1) puts affecteds
                       first if prevalence < 0.5, else unaffecteds first.
                       No ordering is done for (0).  (-1) reverses the
                       standard ordering.  (2) does per-pedigree ordering.
                       (-2) does per-pedigree reverse ordering.  Per-pedigree
                       ordering is not available for multiple traits.  Set
                       DiscreteOrder to 3 to an create output file named
                       "discrete.out" in output directory containing up
                       to 7 variables of the ordered data.

   DiscreteMethod 1    Version of discrete code used.  The default (1)
                       seems the most robust, and use of the alternate
                       method (2) is discouraged.

   UnbalancedTraits 1  Default is to use "unbalanced traits" in bivariate
                       models.  Individuals will be included in the
                       analysis if they have either trait; they do not have
                       to have both.  Individuals are converted to
                       individual-trait's.  (0) Excludes individuals
                       missing either trait.  (-1) Also exclude individuals
                       missing either trait, use "bivariate" feature
                       built-in to Search.f.  

   EnforceBounds 1     Numerical errors during the quadratic-solving phase
                       of maximization can result in overshooting parameter
                       boundaries by a tiny amount.  The default enforces
                       boundaries whenever a new "point" of parameter values
                       is computed.  This was essential for bivariate models
                       to prevent attempted square roots of tiny negative
                       numbers.  (0) turns this feature off.  This is
                       not effective during some phases of maximization,
                       such as estimation of standard errors, where the
                       point is not computed in Search.f, which is why
                       the AbsVarianceParms option was added.

EnforceConstraints 0   Experimental.  (1) turns on attempted enforcement of
                       constraints when the derivatives of the tableau
                       become too large to maintain constraint numerical
                       accuracy better than 1e-4.  (Otherwise, convergence
                       may fail with the "Numerical constraint failure"
                       note.)  Unfortunately, when this is done, models are
                       usually so out of whack for some reason or other that
                       convergence will ultimately fail by exceeding the
                       maximum iteration limit anyway.  (-1) turns off the
                       constraint numerical accuracy test at the end of
                       maximization.  The default (0) tests constraint
                       accuracy at the end of maximization only.

   CorrectDeltas 0     Experimental.  1 turns on attempted correction of
                       deltas for numerical errors during the quadratic
                       problem solving phase of maximization.  The default
                       (0) leaves this turned off because the default 
                       EnforceBounds option accomplishes the intended
                       result more efficiently in most cases.

   AbsVarianceParms 1  The default (1) forces the the abs() function to be
                       applied to all known parameters used in the omega in
                       bivariate maximization.  This prevents tiny negative
                       values from causing NaN's to arize from the
                       application of square root.  (-1) forces the abs() 
                       function to be applied in univariate maxmization as
                       well.  That might cause trouble with some discrete
                       models.  (0) forces the use of the actual parameter
                       values in the omega, negative or not.

   BounDiff 0          This option controls the method used to compute
                       "forward" and "central" differences in likelihood
                       (derivatives) when a parameter is at a boundary.
                       (Currently, this option is applied only in discrete
                       trait modeling where it has been shown to be
                       necessary to ensure complete maximization.)

                       The default (0) prevents the parameter from going
                       beyond the boundary if the boundary is -1 or 0 for
                       lower bounds and 0 or 1 for upper bounds since 
                       these are the typical "hard" boundaries for variance
                       components, correlations, roots, and logs.  Otherwise,
                       parameters are allowed to go beyond boundaries
                       during derivative calculation by very tiny amounts.

                       (1) always enforces the boundaries.
                       (-1) never enforces the boundaries, allowing
                       tiny excursions beyond them in all cases (which,
                       if possible, might give the best results).  
                       (2) applies the historic rule: only upper boundaries
                       are (always) enforced, but only for forward
                       differences, and the simple method of substituting
                       the negated backward difference is used.
                       For options (0) and (1) a more sophisticated
                       algorithm is used when otherwise the boundary 
                       would be crossed illegally.  An inner difference
                       is taken AND adjusted for the slope of the next
                       nearest change in differences.  In other words,
                       the second derivative is used to compute an
                       expected first derivative.

   PedLike 0           Intended for use by pedlike and pedlod commands only.
                       Produces files "pedexclude.dat" and "pedlike.dat"
                       during maximization.

   SampleSameTrustMe   This option declares to SOLAR that the sample for
                       the previous model is identical to the current model.
                       Therefore, the same EVD matrices may be used, and
                       the default checking that is done to determine if the
                       sample is the same is bypassed.  If this option is
                       specified before EVD matrices have been created for
                       any model, the required matrices are created, but the
                       storage of additional data to determine when the
                       sample changes is bypassed.  Thus maximization speed
                       of all EVD models is increased, but only slightly,
                       about 1%.  The downside is that if the user has
                       specified this option in error, the results will be
                       wrong and SOLAR might even crash.  Hence the suffix
                       "TrustMe".  THIS IS A SPECIAL NO-WRITE OPTION which is
                       not written to model files, because when the model
                       is reloaded it may not be applicable at the time it
                       is reloaded.  It should be used only in scripts
                       where the null and test models are absolutely certain
                       to have the same sample.

   evdphase 0          If option modeltype is evd2, the evdphase option
                       is used internally to handle the 3 model phases.
                       In the evdphase 1, the sample is determined and
                       evddata.out is written.  In the second phase a
                       model is created using evddata.out for traits and
                       data with all unrelated individuals and this model
                       is actually maximized using standard quantitative
                       maximization.  In the third phase the maximized
                       model is translated back into the original parameters.

   Eigenvectors 0      If set to 1, the eigenvectors from EVD will be written
                       to files.  To use this option, use the "evdoutev"
                       command (which invokes this option).

   EVDmat 0            If set to a non-zero value, this writes EVD data
                       suitable for fphi or similar procedure to evddata.out.
                       The correct way to use this option is by using the
                       "evdout" command which invokes it.

   FPHIMethod 1        This sets the method for which EVD data should be
                       written.  The correct way to use this option is by
                       using the "fhpi" command which has a "-method2"
                       option, however the default Method 1 works better now.

   ResetRandom 0       If this option is set to a non-zero value, the
                       random number sequence used during maximization will
                       be set to its initial default value at the beginning
                       of maximization.

   dontallowsamplechange 0   If option modeltype is evd, and this option is
                             set to 1, model maximization will terminate
                             prematurely with an error message if the
                             the sample changes from the previous evd
                             model.  If modeltype is evd, and this option
                             is defaulted at 0, the evd code will simply
                             create new matrices as required.  If the
                             option samplesametrustme is 1, this option
                             has no effect unless the size of a pedigree
                             changes.  If the option modeltype is not
                             evd, this option has no effect.
                        
   singulartrait 0     If set to 0, a trait having only one non-blank value
                       in the entire phenotypes file is an error which will
                       halt maximization, because most of the time when this
                       happens it is because there is a user mistake which
                       needs to be corrected.  If set to 1, such a trait
                       will be considered the "unaffected" value of a
                       discrete trait and discrete trait maximization will
                       be performed.  If set to 2, the trait will be
                       considered a quantitative trait and quantitative
                       trait maximization will be performed, but then you
                       must also preset fake upper and lower bounds for
                       the mean and sd parameters because the normal
                       algorithm for guessing them doesn't work in this
                       case.  Use the "parameter" command to preset these
                       boundaries. The latter two options are generally only
                       useful in the presence of covariate variation.

   PolyClasses ""      The polyclasses option is modified by the polyclass
                       command, users should not modify it.  It becomes
                       a comma separated list of all the classes in a
                       polyclass model.  This is created so that maximize
                       can handle discrete trait and mixed trait models by
                       locating the SD parameters for each trait.
                       
   ParameterFormat 16  number of significant digits used for writing
                       parameter values to model files and queries.
                       Prior to Version 6.3.6, the default was 10.
                       16 is a compromise value that displays almost
                       all the precision in a double precision floating
                       point number, while avoiding representational issues
                       that cause a string of 9's to appear at the end of
                       a number.  The largest useful value is 17, that
                       always shows all information available, but sometimes
                       at the expense of being very ugly.

   MatrixNumberFormat 15  number of significant digits used for writing
                          results of matrix operations.  15 works best.

   ExpNotation 0       1 forces exponential notation (but only for certain
                       commands, mga is the only one currently).  0 is
                       auto mode, which typically uses fixed point while
                       a few digits of precision are shown, then flips to
                       exponential notation if required to show a nonzero
                       value.

  ShuffleReseeding 1   1 gives consistent repeatable results by reseeding
                       random generator at the beginning of each mathmatrix
                       shuffle to default seed.  Options are:

    1...seeded on every shuffle to 5489u for consistent results (DEFAULT)
    0...seeded first shuffle to 5489u, then free running
   -1...seeded every shuffle to time() for purely stochastic results
   -2...seeded first shuffle to time(), then free running
    Other values: seed to this value at beginning of each shuffle

IMPORTANT NOTE: the zscore options below pertain to the now obsolescent
zscore command implementation.  Now it is preferred to use the zscore_
prefix operator with the define command.

   zscore 0            The default mode has zscore deactivated.  Activation
                       by setting this to a non-zero number should be done
                       by the zscore command and only when the additionally
                       required z options have been set.

   zmean1 0            When zscore is active, zmean1 should be set to the
                       expected mean value for trait 1.

   zsd1 0              When zscore is active, zsd1 should be set to the
                       expected standard deviation for trait 1.  This option
                       must not be zero when zscore is non-zero.

   zmean2 0            When zscore is active, zmean2 should be set to the
                       expected mean value for trait 2.

   zsd2 0              When zscore is active, zsd2 should be set to the
                       expected standard deviation value for trait 2.  This
                       option must not be zero when zscore is non-zero.


Shortcuts: opt - options

Return to command index

A1.135 outdir


Purpose:  Set maximization output directory (overriding default)

Usage:    outdir <dirname>
          outdir                   ; shows current outdir
          outdir -default          ; restore default: (trait name)

Notes:   By default, solar models and related output are written to
         the maximization output directory.  By default, that directory
         is named after the trait.*  For bivariate models, the trait
         names are separated by a period (".").

         The default output directory can be overridden by this command.
         Once set, it stays overridden until the "outdir -default" command
         is given, or a new SOLAR session is started.

         (*The directory will be named after the trait as entered in the
          trait command, rather than as it exists in the phenotypes file.
          For example, it will be named 'foo' if the command 'trait foo'
          has been given, even if the variable is actually named FOO.)

         To prepend the name of the maximization output directory to
         any filename, use the "full_filename" command.

Shortcuts: outd - outdir

Return to command index

A1.136 parameter


Purpose:  Create, modify, or list parameter(s)

Usage:    parameter <name>                ; display or create a new parameter
          parameter <name> start <value> lower <value> upper <value>
                                          ; set parameter start and 
                                          ;   boundaries
          parameter <name> =              ; return current parameter value
          parameter <name> = <value>      ; set current (start) value
          parameter <name> start <value>  ; set current (start) value
          parameter <name> se             ; display last computed std error
          parameter <name>                ; display all parameter information
          parameter delete <name>         ; delete parameter
          parameter                       ; display all parameters
          parameter -return               ; return parameter info in a list
          parameter fixupper <value>      ; set upper bound as fixed
          parameter fixlower <value>      ; set lower bound as fixed
                                         ; fixed bounds are not auto adjusted

Notes:    (1) The most commonly required standard parameters are created
              automatically when you give the "covariate" and "polygenic"
              commands.  Their starting points and boundaries are also
              set automatically either then or at the start of maximization
              by fairly reliable heuristics.  Boundaries are set
              automatically only if both boundaries are set to zero at the
              beginning of maximization, so if you preset one boundary,
              be sure to set the other.  The "standard" parameters include:

                  mean (for trait)
                  SD (standard deviation)
                  e2
                  h2r
                  h2q1
                  c2
                  b* (covariate beta)

          (2) "start" and "=" are identical operators.  "=" is simply a
              more convenient and mnemonic name in most cases.  Once
              a maximization has been performed, the "start" or "="
              value will actually be the maximimum likelihood estimate.
              It will then be the starting value for the NEXT maximization.

              Note that when you are setting the starting value for a
              parameter, you must surround the "=" used for assignment
              with spaces.  For example:

                  parameter h2r = 0.1

              If you did not surround the = with spaces, it would appear
              that you were simply trying to create a new parameter named
              h2r=0.1.  To prevent this kind of mistake, such names are
              not allowed.  See note 4 below.

          (3) When a parameter is deleted, any constraint it appears in is
              deleted also.  This behavior is obsolescent.  In the future,
              a single TERM may be deleted from the constraint instead.
              In the meantime, it is recommended to edit constraints to
              remove parameters slated for deletion, THEN delete the
              parameter.

          (4) When naming parameters, you are advised to stick to the
              usual alphabetic, numeral, and underscore characters unless
              there is a good reason not to.

              However, other special characters may be actually allowed in
              order to accomdate all automatically created parameter
              names.  The use of some of these characters will
              force the requirement that these parameters be quoted in
              <> when used in constraint, omega, and mu commands so as
              not to imply math operations.

Shortcuts: par - parameters

Return to command index

A1.137 ped2csv


Purpose:  Convert Pedsys format file to comma delimited format

Usage:    ped2csv <pedfilename> [<outfilename>]

          If <outfile> is not specified, filename is <pedfile>.csv

Notes:    BLANK fields are removed.  Duplicate field names will cause an
          error.

          This command uses the "selectrecords" command, which makes it
          very easy: "selectrecords <pedfilename> <outfilename>".  Since
          no condition is specified, all records are selected, and since
          selectrecords uses the tablefile command, it can read pedsys files.

Return to command index

A1.138 pedifromsnps


solar::build_grm --
Purpose: Creates a empirical pedigree matrix from a plink data set 

Usage: pedifromsnps -i <input base name of plink data> -o <output csv file name>
       --freq <file made with plink_freq>
       [optional: -corr <alpha value>  -per-chromo -king -method_two -normalize
  -batch_size <batch size value> -id_list <file w/ subject IDs>
  -n_threads <number of CPU threads>]

 -i The base file name of the plink .bed, .bim, and .fam files.
 -o The base file name for the output.
   -freq Name of output file from plink_freq command.
   -n_threads Number of CPU threads used for matrix calculation. 
      Default: Automatically set based on hardware
   -per-chromo Outputs a separate matrix for each chromosome. Default: Disabled
   -corr <alpha value> Compute method one correlation GRM using 
  	this alpha value. Default: -1
	 -method_two Computes correlation GRM using a second method 
	described below. Default: Disabled
   -king Computes Robust King GRM instead of using a correlation method.
	Default: Disabled
 -batch_size <batch size value> Number of loci computed at a single
	time per CPU thread. Default: 500
   -id_list <file w/ subject IDs> Specified file contains a list of 
		subject IDs separated by spaces.  The resulting GRM will 
	only use these IDs and excluded all others. Default: All IDs are used
 -normalize When used during the creation of a correlation GRM the final values
	 	are normalized using the square roots of the diagonal values.  The result
	being that diagonal elements are 1 and off-diagonal elements are bounded
	by 1 and -1.  Z*_i_j = Z_i_j/sqrt(Z_i_i*Z_j_j) where Z* is the final value
	and Z is the unnormalized value, i refers to the index of subject i while
 	j refers to the index of subject j.

 Default method for GRM creation is correlation method one with default alpha = -1.
 
Correlation Method One:  
Z_i_j = 1/m * sum_k=0 to k=m (SNP_i_k - 2*f_k)*(SNP_j_k - 2*f_k)*(2*f_k*(1-f_k))**alpha

Z = GRM value

i = subject row index

j = subject column index

k = locus index

SNP_i_k = SNP value for subject i and locus k (must be 0,1,or 2)

f_k = major allele frequency of locus k

m = total number of loci not including loci that are missing values for 
  subjects i or j

      alpha = exponent for variance, best set to default value of -1 

Correlation Method Two:
	Z_i_j = 1/variance_sum * sum_k=0 to k=m (SNP_i_k - 2*f_k)*(SNP_j_k - 2*f_k)

Note: Variables are the same as above except for variance_sum

variance_sum = sum_k=0 to k=m 2*f_k*(1-f_k) 
Note: Locus index k is excluded if there is a missing SNP value for 
            subject i or j.  Therefore the variance_sum calculated 
            for one element of the GRM may differ from another element. 

Robust King Method:
Z_i_j = 1 - (sum_k=0 to k=m (SNP_i_k - SNP_j_k)**2)/(2*min(N_i,N_j))

N_i = number of heterozygotes for subject i
N_j = number of heterozygotes for subject j
Note: Heterozygotes for subject i or j are not included if the other
      subject is missing a value at that locus.

For information on robust king method see Manichaikul A, Mychaleckyj JC, 
Rich SS, Daly K, Sale M, Chen WM  
      (2010) Robust relationship inference in genome-wide association studies. 
 Bioinformatics 26(22):2867-287
       

Return to command index

A1.139 pedigree


Purpose:  Process the pedigree data.

Usage:    load pedigree <filename> [-founders]   ; loads pedigree file
          load epedigree <filename> [-t <threshold>] [-1] ; see below **
          pedigree show [all | <ped#>]           ; displays pedigree data
          pedigree classes [-full [-nowarn] [-phi2]] [-model [-meanf]]
                                         ; displays relative-class counts

          When a pedigree file is loaded, each individual in the file is
          assigned a unique integer identifier for internal use by SOLAR.
          The mapping from permanent IDs to integer IDs is stored in the
          file 'pedindex.out'.  Therefore, loading a pedigree data file
          named 'pedindex.out' is not allowed since that would result in
          the pedigree file being overwritten.  Attempting to re-load
          a previously created 'pedindex.out' under a different name will
          not work either; see Section 8.2.1 in the Manual for discussion.

          If the pedigree file contains founders only, i.e. a set of
          unrelated individuals with no parental data, parent ID fields
          are not required.  In this case, the '-founders' option must
          be included in the load command.  If this option is specified
          but the pedigree file does contain parent ID fields, those
          fields will be ignored

          ** Beginning with version 8.3.0, empirical pedigrees may be loaded.
          Empirical pedigrees are csv files which are representations of
          a kinship matrix, having kinship values for pairs of individuals.
          Empirical pedigrees may be loaded either with the regular
          "pedigree" keyword or "epedigree".  However if "load pedigree"
          is used, and the file is not obviously an empirical pedigree,
          an uninformative error message may result.  It must have
          fields IDA, IDB, and KIN specified in the header line to
          be recognized as an empirical pedigree.  Additional options
          -t <threshold> and -1 are available.  Further discussion
          of empirical pedigrees is given under "help epedigree."

          If the keyword 'all' is given in the 'pedigree show' command,
          detailed info is displayed for all pedigrees.  If a pedigree
          number is specified, detailed info is displayed for that
          pedigree only.  If no argument is given, the show command
          displays summary information.

          The 'pedigree classes' command displays a tally of the relative
          classes present in the pedigree data.  By default, the counts
          for relationships of 3rd degree and higher, as well as some
          1st and 2nd degree relationships, are combined.  If the '-full'
          option is included, then the counts for all relative classes
          are given separately.  In this case, a warning message will be
          displayed if any of the relative classes cannot be handled by
          SOLAR's native method for computing multipoint IBDs.  The
          '-nowarn' option will turn the warning off.  If the '-phi2'
          option is included with the '-full' option, an additional
          column will be displayed which contains the kinship coefficient,
          multiplied by 2, for each relative class.

          If the '-model' option in included in the 'pedigree classes'
          command, the relative class tallies will include only those
          pairs of individuals who both enter the polygenic analysis
          specified in the null0 model for the current trait.  The
          '-full' and '-phi2' options work as described above, but the
          '-nowarn' option is superfluous since the warning message
          described above is never displayed.  When the '-meanf' option
          is included with the '-full' option, the "Mean F" statistic
          is calculated and displayed.

          The state of the currently loaded pedigree data is stored in
          the file 'pedigree.info' in the current working directory.
          This file persists between SOLAR runs, which means that the
          pedigree data which is loaded at the end of a session will
          still be loaded the next time SOLAR is invoked (from within
          the same working directory.)

Notes:    The pedigree load command creates several files in the current
          working directory:

             pedindex.out   maps each ego ID to a sequential ID (IBDID)
                              assigned by SOLAR
             pedindex.cde   PEDSYS code file for pedindex.out
             phi2.gz        gzipped file containing the kinship matrix
                              multiplied by 2
             house.gz       gzipped file containing the household matrix

          The household matrix file will be created only if a household
          ID field is present in the pedigree file.

          The files listed above are specifically created for the
          pedigree file being loaded.  They will be deleted when a new
          pedigree file is loaded.  Hence, a different working directory
          must be used for each data set.

For a description of the pedigree file, enter 'file-pedigree'


Shortcuts: ped - pedigrees

Return to command index

A1.140 pedlike


Purpose:  Calculate pedigree-specific loglikelihoods

Usage:    pedlike [-q] [<model>]

          -q    (quiet) Supress output to terminal

Notes:    Default model will be current model, if current model has
          been maximized.  If changes have been made to current model
          since the last maximization, results may not be predictable.

          If current model has not been maximized, default model is
          the null0 model in current outdir.

          Results are written to "pedlike.out" in the outdir
          and also shown on terminal with some additional summary info.

          The pedigree numbers used are based on SOLAR's pedigree
          index "pedindex.out" created by the "load pedigree"
          command.  These do not necessarily correspond to user
          pedigree numbers (and there is not necessarily even
          a one-to-one correspondence).  Refer to pedindex.out
          to associate your ID's with the pedigrees associated
          by SOLAR.  (Note: pedindex.out has a code file pedindex.cde
          and is best read using PEDSYS.)

    

Shortcuts: pedli - pedlike

Return to command index

A1.141 pedlod


Purpose:  Calculate pedigree-specific LOD scores

Usage:    pedlod [<test-model> [<null-model>]]

Notes:    If no model is specified, the model currently in memory
          is used as the test-model (useful if you have just run
          multipoint or twopoint), and its null-model (having
          one less linkage element) is used as the null model.

          If only one model is specified, the null model is taken
          from the outdir after the specified model is loaded.

          The pedigree numbers used are based on SOLAR's pedigree
          index "pedindex.out" created by the "load pedigree"
          command.  These do not necessarily correspond to user
          pedigree numbers (and there is not necessarily even
          a one-to-one correspondence).  Refer to pedindex.out
          to associate your ID's with the pedigrees associated
          by SOLAR.  (Note: pedindex.out has a code file pedindex.cde
          and is best read using PEDSYS, but may also be read fairly
          well as a text file if PEDSYS is not available.)

          Note that the LOD score calculation may be affected by the
          number trait(s), and the lodp options.  See the documentation
          for the "lodp" command for further details.  When applicable,
          SOLAR converts 2df bivariate LODs to "1df effective" LODs.

Shortcuts: pedlo - pedlod

Return to command index

A1.142 perdelta


Purpose:  Set delta used by perturb

Usage: perdelta <number>

Notes: Defaults to 0.001

Shortcuts: perd - perdelta

Return to command index

A1.143 perturb


Purpose:  Perturb starting values for E2, H2r, and H2q's at bounds

Usage:    perturb

Notes:   perturb is specially tailored to the standard parameterization
         of e2, h2r, h2q1, etc.  perturb does nothing silently if
         parameters e2 and h2r are not present.

         It is no longer necessary or possible to specify h2qindex as an
         unqualified argument as in earlier versions.  If an unqualified
         argument is specified, it is ignored.

         This is used automatically by the 'linkmod' script, and therefore
         also by multipoint and twopoint.

         perdelta is the quantity used in the adjustment (this may be set
         with the perdelta command).  It defaults to 0.001

         Accumulated deltas are distributed only to other parameters whose
         values are 3*perdelta away from the relevant bound, and then only
         in the perdelta quantity.

         This does not handle conditions where parameters exceed boundaries
         by more than a very small amount.  (Of course, they shouldn't
         exceed the boundaries at all, but sometimes they do by very small
         amounts.  Recent changes to the maximization routines ought to
         eliminate that.)

Shortcuts: perturb - perturb

Return to command index

A1.144 phenotypes


Purpose:  Load the phenotypes file or display its variables 

Usage:    load phenotypes [<filename>]+      ;#  sets phenotype file
          phenotypes                         ;#  returns phenotype names
                                             ;#    (and filenames)
          phenotypes -files                  ;#  returns filenames used
                                             ;#    (useful in scripts)

Notes:    (1) Pedigree data should be loaded beforehand with the 
              "load pedigree" command.  You may have pedigree and
              phenotypes data in the same file if it has all the
              required fields, but you will still have to load it
              as phenotypes after loading it as the pedigree.

          (2) The phenotypes file may be in Comma Delimited or PEDSYS format.
              There must be an ID field (and FAMID field, if ID's are not
              unique in all the data), and then there may be any number of
              phenotypes.  Sex is determined by the SEX field in the
              pedigree file and a SEX field in the phenotypes file is
              ignored.  If FAMID is present in both pedigree and
              phenotypes files, it is assumed to be necessary to make ID's
              unique.  If FAMID is not present in both files, uniqueness
              of ID's is tested during loading (since this is an often
              overlooked user error).  If FAMID is present in both files,
              uniqueness is not tested.  The fieldname EGO may be used
              in place of ID, and the "field" command may may be used to
              specify a different field name for ID.  For more discussion
              about the phenotypes file format, see file-phenotypes.

          (3) Once a phenotypes file is loaded in a particular working
              directory, it remains loaded until another phenotypes is
              loaded, even if SOLAR is restarted there at a later time.
              The current pedigree state is kept in a file named
              phenotypes.info, which points to the current pedigree file(s)
              by name.  When SOLAR starts, it check this file, and
              get the header from the phenotypes file(s) so that the
              phenotypes available are known.

          (4) During maximization, the pedigree data and phenotypes file
              are joined, so it is possible some errors will not be apparent
              until that time.

          (5) Individuals missing phenotypic data are removed from the
              maximization sample, and need not be included in the
              phenotypes file, however they should be included in the
              pedigree file as they may contribute to the pedigree
              structure and genetic coefficients of those individuals
              who are included.

          (6) Families in which no non-probands are present are removed
              from the maximization sample.  Proband status is controlled
              by a PROBND field in the phenotypes file.  To switch
              proband detection off, you may rename that field, or
              use the command "field proband -none".


Shortcuts: phen - phenotypes

Return to command index

A1.145 plink_converter


Purpose: Converts plink .bed, .fam, and .bim to .csv file
         
Usage: plink_converter -i <input base name> -o <output base name> optional:<-bin>
						<-max> <maximum snps per file> -perchromo

Example: plink_converter -i test -o test -bin -max 50000
         
           -bin Assigns values 0, 1, 2 to alllele combinations. A table with assignment
                valuesis provided.
           -max Sets the maximum number of snps per .csv file.  For example if an input set 
            contains 500 snps and the option -max 100 is used then five .csv files
            labeled <output_base_name>_<file number>.csv will be created.
     -perchromo Switch that separates output by chromosome.
	     -solar Outputs snp data as 1/1,1/2,or 2/2
    

Return to command index

A1.146 plink_freq


Purpose: Calculates the allele frequencies from a plink file

Usage: plink_freq -plink <base filename of plink file set>
                  -o <output file name> 

pedifromsnps and gpu_pedifromsnps require the output of this 
command.  The text file output consists of an initial value
that represents the number of loci in the file.  The subsequent
values are the frequency values for each locus.  If a value could
not be calculated for whatever reason, then -1 is written in place
of a frequency. 	

Return to command index

A1.147 plot


Purpose:  Plot multipoint LOD scores, empirical LOD adjustments, or power

Usage:    plot [<chromnum>] [-pass <passnum>] [-write]
                    [-color <colornum>] [-overlay]
                    [-title <title>] [-subtitle <subtitle>]
                    [-yscale <maxy>] [-map <user_map>] [-lodadj]
                    [-min x] [-max x] [-nomark] [-nomarklab]
                    [-all | -allpass [-nodisplay] [-nomini]]
                    [-string [-allpass] [-lod <lod>] [-lodmark] [-lodscale]
                      [-color <colorname>] [-noconv] [-date] [-name <name>]
                      [-font <X-font-spec>] [-titlefont <X-font-spec]
                      [-layers {{<layername> [-color <colorname>]} ... }
                      [-replay {{<layername> [-color <colorname>]} ... }
                      [-title <title>] [-dash 1/2] [-linestyle 1/2]
                    [-liability [-model <name>]]
                    [-power [-title <plot_title>]]

          plot -purge
          plot -close

Examples: plot                    plot chromosome with highest LOD in pass 1
          plot 9                  plot chromosome 9 in pass 1
          plot 9 -pass 2          plot chromosome 9 in pass 2
          plot -all               plot all chromosomes in pass 1
          plot -all -pass 2       plot all chromosomes in pass 2
          plot -allpass           plot all chromosomes in all passes
          plot -string            plot all chromosomes in pass 1 using 
                                    "string" plot format
          plot -string -allpass   plot all chromosomes in all passes using
                                    "string" plot format

If postscript output files are saved, they are written to the current
trait or outdir directory with names like these:

  chr01.ps              chromosome 1 (pass 1)
  chr01.pass02.ps       chromosome 1 (pass 2)
  pass01.ps             Miniplot of chromosomes in pass 1 (plot -all -pass 1)
  pass01.str.ps         String plot of pass 1


          chromnum  [1-29] Set chromosome number for plotting.  The default
                    is to plot chromosome with highest LOD score.

          -pass     Set multipoint pass number for plotting.  "1" would
                    mean the first pass in which all models have one
                    QTL.  1 is the default.

          -close  Close the XMGR plot window.  The miniplot and string plot
                  display windows must be closed with their close buttons,
                  but it is better if you close XMGR from the SOLAR
                  command line.  Otherwise, on your next plot, there will
                  be a delay until SOLAR determines that it cannot
                  communicate with the old XMGR session.  Then, it will
                  time-out and tell you to use the "tclgr close" command,
                  which does the same thing as "plot -close".

          -write    Write postscript output file for plot(s).  If there are
                    no other arguments and if a plot was done previously,
                    the output file for the previous plot is written.  
                    Miniplot and stringplot files are always written
                    by default.  For plots drawn using XMGR, you can
                    also choose to write the postscript file from the
                    XMGR graphical interface, which give you more options.
                    See note 8 below.

          -nomark     Do not show ticks or labels for markers.  (This works
                      for both regular and -string plots.)  Unless this
                      option is selected, there must be a mibddir
                      selection in the current directory so that SOLAR
                      can find the map files.

          -nomarklab  Do not show labels for markers (still show ticks).

          -title      Set plot title.  Title may be blanked with
                      -title "" or -title " ".  This is supported
                      by regular plots, string plots, and power
                      plots only.  Plots made through XMGR may also
                      have title set through graphical interface or
                      by editing .gr file such as multipoint.gr.

          -subtitle   Set plot subtitle.  Supported by regular
                      multipoint plots only.  Subtitle may be blanked
                      with -subtitle "" or -subtitle " ".

          -color  Use this color for curve (overrides multipoint.gr default)

                  For regular plots, this must be integer from 1-15; 
                  colors are defined by XMGR:

                  0:White 1:Black 2:Red 3:Green 4:Blue 5:Yellow
                  6:Brown 7:Gray 8:Violet 9:Cyan 10:Magenta 11:Orange
                  12:Indigo 13:Maroon 14:Turquoise 15:Green4

                  For string plots, the X11 color names are used.  Typical
                  color names are:

                  black white blue red green grey orange purple brown violet
                  magenta yellow cyan

                  Many mixtures and shades are also available.  Find the
                  rgb.txt file in your X11 installation for a complete list.

          -overlay  Plot this curve on top of the current graph, which may
                    already include more than one curve.  (Each curve
                    corresponds to a distinct XMGR set, of which 30 are
                    available in the custom version of XMGR used by SOLAR.
                    To control order of sets in Legend, use the -set
                    option for every plot.)

          -purge    Delete all previously created plotfiles (not valid with
                    other options; only valid for multipoint plots).

          -string     Plot all chromosomes (in pass 1 unless otherwise
                      specified) using "string plot" format.  (An
                      alternative page of plots in xmgr format can be
                      produced by with plot -all command.)

          -name       Name this plot for later use (-string plots only).

          -layers <layerlist>  Add one or more previous plots to this plot.
                      This is either a simple list of previous names, or a
                      nested list of names with other options, in either case
                      each element of <layerlist> specifies a single layer.
                      See extended example below under replay.
                     (-string plots only).

          -replay <layerlist>  Draw previous plots only, otherwise this is
                      the same as -layers.  (-string plots only) Example:

    trait q1
    plot -string -name A1
    trait q2
    plot -string -name A2 -layers {{A1 -color green}}
    trait q3
    plot -string -name A3 -layers {{A2 -color blue} {A1 -color green}}
    plot -string -replay {{A3 -color grey} {A2 -color blue} {A1 -color red}}
    plot -string -replay {A3 A2 A1}  ;# just default colors

                 Note that spaces between close and open braces, as
                 shown above, is required.

                 You can specify -color for the new top level plot and/or
                 for layers in the -layers or -replay list.  Any unspecified
                 colors will default to a built-in set of defaults.

          -lod lod    Add horizontal scales above this lodscore (for string
                        plot only)

          -noconv     Do not mark convergence errors (string plot only)

          -date       Datestamp (string plot only)

          -lodmark    Put marker ticks ON TOP of LOD curve (default is to the
                        left of the plot axis)  String plot only.

          -lodscale   Show the LOD scale when this LOD is exceeded (default
                      is for the scale only to appear for the highest LOD).
                      String plot only.

          -font       (String plot only!) Specify X font to use for title.
                      Use command "xlsfonts | more" to list X fonts.
                      Wildcards may be used, however, results are
                      sometimes unexpected.  For example *bold-r* will
                      match first bold,roman font in list, whatever
                      that happens to be.

          -dash 5/5   (String plot only!) Specify line style dot and
                       dash.  Two or more integers are specified separated
                       by slash.  The first and all odd position numbers
                       specify the length of drawn segments, whereas the
                       second and all even position numbers specify
                       the transparent segments.  Results are approximate
                       depending on system used.

          -linestyle 5/5  (String plot only!) Same as -dash (see above).

                       Note that for regular plots, linestyle can be
                       changed by editing the linestyle parameter in the
                       applicable .gr file such as multipoint.gr.

          -titlefont  (String plot only!) Same as -font, except applies
                      to title only.  Supercedes -font for title only.

          -all        Plot all chromosomes in xmgr postscript format (in
                      the first pass only unless -allpass specified). A page
                      of miniature chromosome plots in postscript is
                      created (if a python interpreter is available).  The
                      names of all postscript files are listed, and any
                      of them may be printed with the lp command.  Nothing
                      is displayed on your desktop with this command. An
                      alternative genome plot is available with
                      "plot -string".

          -allpass    Plot all chromosomes in all passes, producing either
                      miniplots or "string" plots (if -string).

          -nodisplay  Used with -all or -allpass to skip displaying 
                      miniplots on-screen (has no effect on xmgr graphs).

          -nomini     Used with -all or -allpass to skip making miniplots.
                      Automatically sets the "-write" option to write all
                      individual chromosome plots.  Miniplots can always
                      be made later, and with more options, using the 
                      separate "miniplot" command.

          -yscale  [NOTE: Ordinarily you do not need to use this.]
                   This sets the smallest LOD scaling if there is no
                   LOD above 4.99.  Autoscaling will not apply for smaller
                   values to prevent confusion (e.g. seeing what looks
                   like a large peak but isn't because the y scale is
                   is so small).  The default value is 4.99.  You can
                   set this to 0 if you really want to look at tiny LOD
                   curves.  Larger scaling is applied automatically, as
                   is the adjustment to allow space for marker labels.

          -map    Use user_map file (in user map format).  By default,
                  the map information processed by the 'load map' command
                  (and saved in the mibddir) is used to display marker
                  labels and locations.  However, you can substitute a user
                  map file (set file-map) by using this argument.  This 
                  allows you to delete unimportant markers and add QTL's of
                  particular interest.
               
          -min m  location at which to start plotting
          -max m  location at which to end plotting
                  (min and/or max may be used to restrict interval.
                   These apply to ordinary chromosome plots only.)

          -quick  Save time when plotting by re-using marker ticks and
                  labels from previous plot.  This option is used
                  automatically when you are plotting from the multipoint
                  command using the "-plot" argument.

          -lodadj  Plot empirical LOD adjustment scores.  None of the above
                   arguments except -close and -color are applicable in
                   this case.  The format file lodadj.gr is used instead
                   of multipoint.gr, but the rules are applied in the
                   same way (see notes below).

          -liability   Plot discrete trait liability function (a different
                       kind of plot, not related to above).  "polygenic"
                       command must have been run first, and the following
                       covariates must have been included:

                         sex age age*sex age^2 age^2*sex

                       The xmgr parameter file for -liability is liability.gr

          -model name  Specify a different modelname for the -liability
                       option.  There must be a companion maximization
                       output file (maximize -output name.out) so, for
                       example, there is name.mod and name.out.  The
                       default is poly (poly.mod and poly.out are created
                       by the polygenic command).

          -power  Plot power versus QTL heritability.  Only the -title
                  argument is applicable in this case.  The format file
                  power.gr is used instead of multipoint.gr.

Notes:

         1.  The trait or outdir must have previously been specified so
             the applicable multipoint file can be found.

         2.  Marker labels and ticks are taken from the mibdchrN.loc file
             (where N is the chromosome number) created from the user's
             map file during mibd file creation.  These files should be
             stored in the mibddir (and the mibddir should be specified
             before running plot).  If the mibdchrN.loc file(s) cannot
             be found, marker ticks and labels will not be shown.  In
             SOLAR releases 1.1.2-1.2.0 the 'load map' command will create
             a mibdchrN.loc file in the current directory.

             There will be a tick for each marker, and up to 42 marker
             labels will be displayed.  If there are more than 42
             markers, some labels will not be displayed.  Labels are
             prioritized based on LOD score and distance from nonzero
             value.  By plotting after the multipoint session has
             completed, one gets the best benefit from this label
             prioritization.  Marker ticks are always drawn vertically;
             add additional line (which might be diagonal) joins the
             label to its tick.

             You can eliminate the need for the map file by using the
             -nomark option.

         3.  XMGR (ACE/gr) is used for most plotting, using tclgr command.
             Each SOLAR process can have only one tclgr session open.  You
             can change the plot command used with the 'tclgr syscommand'
             command (it must be XMGR pipe command compatible).
             If SOLAR is exited before closing the plot session, the plot
             session may remain active (however, it may print a warning
             about not being able to access the named pipe).  If the user
             terminates the XMGR session through its graphical interface, 
             the command 'plot -close' must be given to reset it before
             another plot command can be given.

          4. The XMGR parameter setup file multipoint.gr is loaded.
             First, the multipoint.gr from SOLAR_LIB is loaded, then the
             multipoint.gr from ~/lib (if any), then the multipoint.gr
             from the working directory (if any).  You need only include
             the parameters you want to change in your local copy.

          5. When SOLAR exits, the XMGR session will be terminated.  If
             the automatic termination of XMGR should fail, the user
             should terminate XMGR anyway to prevent it from hogging CPU.
             (The custom XMGR in SOLAR_BIN prevents CPU hogging.)

          6. NaN's are indicated by X's on the curve.  Areas of the curve
             in between multiple X's may be invalid.  (NaN's are Not A Number
             which means maximization failed to arrive at a valid likelihood
             estimate.

          7. There are two additional options, -set and -graph, whose usage
             is discouraged except under exceptional circumstances.  They
             might force the set and graph numbers to specific values.
             By default, the set number is 1 (changed in version 1.6.0)
             except for overlays.  Overlays use the first available set
             number counting backwards from 29.  The graph number (except
             for overlays) is the same as the set number (overlays must use
             the preceding graph number).  Fooling with these can get you
             into trouble, but under difficult circumstances they might
             help.

         8.  Standard postscript landscape mode is used in all output files.
             If you want to choose any other output features, such as
             Encapsulated Postscript (EPS), portrait mode, etc., for
             those plots made by XMGR, you can open the "Printer Setup"
             dialog (under the "File" menu).  There you can select
             portrait output in a pulldown menu, check a "Generate EPS"
             check box, etc.  Then, to write the file, select the
             "File" option in the "Print to:" pulldown, and then press
             the "Print" button at the bottom of the dialog box.  You
             need not go to the separate "Print" option in the file menu,
             and sometimes it seems to work better to print directly
             from the Printer Setup dialog anyway.  All postscript files
             can be printed using "lp" command.  Displaying postscript
             or editing on screen depends on locally available software.

Shortcuts: plo - plotmulti

Return to command index

A1.148 plotqtld


Purpose:  Plot qtld (qtld.out)

Usage:    plotqtld <type> [-nolabels] [-nomarkers] [-file filename] [-local]

          <type> is one of: strat, mgeno, qtdt, qtld

          -nolabels     do not include "marker" labels (ticks only)
          -nomarkers    do not include marker ticks or labels
          -file         Use named file instead of qtldm.out in outdir
          -local        Ignore default plot parameters; use only local file

Notes:    You must select the trait or outdir first.

          The plot parameter file (in SOLAR_LIB) is qtld.gr.  You
          may override it with a copy in ~/lib or your working
          directory.  Your version need only include the parameters
          you would like to change.  This should work in most cases.
          If you specify -local, however, the qtld.gr in SOLAR_LIB
          is completely ignored, and your qtld.gr must be complete,
          which might get around some very obscure conflict between
          the two plot parameter files.

Return to command index

A1.149 plotqtn


Purpose:  Plot qtn marginal tests (qtnm.out)

Usage:    plotqtn [-nolabels] [-nomarkers] [-file filename] [-local]

          -nolabels     do not include "marker" labels (ticks only)
          -nomarkers    do not include marker ticks or labels
          -file         Use named file instead of qtnm.out in outdir
          -local        Ignore default plot parameters; use only local file

Notes:    You must select the trait or outdir first.  See qtnm for
          more information.  It must be possible to find the qtnm.out
          file in the outdir.

          The plot parameter file (in SOLAR_LIB) is qtn.gr.  You
          may override it with a copy in ~/lib or your working
          directory.  Your version need only include the parameters
          you would like to change.  This should work in most cases.
          If you specify -local, however, the qtn.gr in SOLAR_LIB
          is completely ignored, and your qtn.gr must be complete,
          which might get around some very obscure conflict between
          the two plot parameter files.

          plotqtn accepts either the original 4 or the new 5 column qtnm.out
          files.  The 5 column files begin with the snp name that is not
          necessarily the location.

Return to command index

A1.150 polyclass


Purpose:  Set up polygenic model with class specific parameterization

Usage:    polyclass [-g] [-intrait] [-incovar] [<class-start>[-<class-end>]]+
                    [-comb] [-maxi] [-rincovar] [-maxsnp <snp_name>]
          sporclass [-g] [-intrait] [-incovar] [<class-start>[-<class-end>]]+
                    [-comb] [-maxi] [-rincovar] [-maxsnp <snp_name>]

          -g   Use global phenotypic values to set parameter adjustments
               (otherwise, means are determined for each class)

          -intrait  inormalize trait values on a per-class basis
          -resmax inormalize residual values in place of traits
          -incovar  (NOT WORKING IN version 7.1.2) inormalize covar values
                    on a per-class basis (only used for simple linear
                    covariates, no interactions or exponents)
          -comb     all classes combined model
          -max      after building the model, maximize it

          -maxsnp <snp_name>  Maximize and include  snp_name as covariate
                       in the model and determine statistics for it: beta,
                       beta se, chi, p, and variance explained (varexp).  
                       H2r's are reported for the models with and
                       without the snp.
          -append   Append results to existing output file(s) if any

Short Example:

           trait q4
           covariate age sex
           polyclass 1-3 9
           maximize -q

Notes: One phenotypes file must have a field named "class" which defines
       the class value for each person in the sample.

       Class specific parameters are given names with _c<class> appended.

       User covariates are transformed into class-specific mu addends.
       All individuals in sample must have all variables specified as
       covariates.

       After choosing trait and covariates, do either sporclass or
       polyclass.  You cannot do a second polyclass on a sporclassed model
       to make it polygenic.

       Unbalanced covariates for multivariate traits are not supported.
       This is different from ordinary covariate behavior for multivariate
       traits--which permits covariates to be missing in the sample if they
       are specific to a missing trait.

       A defined pseudo-covariate named "blank_classes()" restricts the
       sample to the union of all classes specified.

       The maximized model is asved in the output directory as
       polyclassmax.mod with output file polyclassmax.out.  Note that if
       -intrait option is selected, trait name and default output
       directory will have leading i_ prefix (for the inormalization).
       If the -resmax option is selected, the trait will be named
       "residual" or "i_residual" if -intrait is also selected.


Return to command index

A1.151 polyclass_normalize


Purpose:  Runs sporadic model and inormal on a phenotype 

Usage:    polyclass_normalize -out <output filename>  
           optional: -class <class values seperated by comma>
          
          Example:
            load phenotypes <phenotypes file>
            covariates <covariate list>
            trait  <trait to be analyized>
            polyclass_normalize  -class 1,2,3 -out polyclass_normalize_output.csv
            


  Polyclass normalization function will perform normalization of datasets collected 
across different studies in preparation for mega-analysis. See Jahanshad an 
Kochunov Neuroimage. 2014 Apr 15;90:470-1. for details.  In short, the data for 
each dataset should be coded by class variable with one class value (0, 1, 2, 3 ..) 
codding the individual datasets. The polyclass normalization will perform 
regression of the covarariates for each dataset and then inverse Gaussian 
normalization of the residuals. The outputs will be written in the file 
that can specified by -out argument.

Return to command index

A1.152 polygenic


Purpose:  Perform polygenic, sporadic, and/or household analysis
            Calculate H2r, significance of H2r, and proportion of variance
              contributed by covariates.
            Optionally performs covariate screening (determine significance
              level of each covariate).

Usage:   polygenic [-screen] [-all] [-p | -prob <p>] [-fix <covar>]
                   [-testcovar <covar>] [-testrhoe] [-testrhog] [-testrhoc]
                   [-sporadic] [-keephouse] [-testrhop] [-rhopse] [-fphi]

         (screencov is an alias for 'polygenic -screen')
         (sporadic is an alias for 'polygenic -sporadic')

         Typically before giving this command, you will give trait,
         covariate, and house (if applicable) commands.  You will also load
         pedigree and phenotypes files if they have not already been loaded.

             solar> load pedigree ped
             solar> load phenotypes phen
             solar> trait hbp
             solar> covariate age sex age*sex smoke
             solar> polygenic -screen

         Alternatively, you may use the "automodel" command first to
         include all available phenotypes as covariates.  See note 2
         below and "help automodel".

         -screen   (or -s)  Perform covariate screening:
                   Calculate significance level for each covariate, and run
                   only the significant covariates in the final analysis.
                   An inclusive significance threshold of 0.1 is used,
                   but may be changed with the -prob option.  Covariates
                   may be locked in regardless of significance with the
                   -fix or -all options.

         (An alternative method of covariate analysis using bayesian
          model averaging is available with the command:
              bayesavg -covariates)

         -p        (or -prob)  p is the probability level for keeping
                   covariates as "significant."  The default is 0.1.
                   It is set to be generous so that covariates are not
                   removed unnecessarily.  (The probability levels for
                   H2r and C2 are fixed at 0.05, however, H2r is never
                   removed from the final model even if it judged to
                   be not significant, and C2 is only removed from the
                   model if it is zero in the final model and therefore
                   has no effect at all.)

         -fix      (or -f) "fix" (lock in) this particular covariate
                   regardless of significance level.  NOTE: a -fix or -f
                   qualifier is required for each covariate to be fixed,
                   for example:  -f age -f sex

         -all      (or -a) Keep all covariates in final anaysis regardless
                   of significance level.

         -testcovar <covar>  Test the probability of this covariate only.
                    All other covariates are fixed and unscreened.  This
                    argument is incompatible with -screen (screen all
                    covariates).  The tested covariate is not removed from
                    final model regardless of probability.  For -testcovar,
                    the default probability level for declared
                    "significance" is 0.05 (which can be changed with -p
                    option).  Also, the reported proportion of variance
                    is for the tested covariate only.

         -testrhoe  (Bivariate only)  Test significance of rhoe difference
                    from 0 by running model where rhoe is constrained to 0.
                    The p value is shown in the same line as the RhoE value.

         -testrhog  (Bivariate only)  Test significance of rhog differences
                    from zero and from 1 (if positive) or -1 (if negative).
                    Because there may be two p values, they are shown
                    in line(s) below the RhoG result and standard error.

         -testrhoc  (Bivariate Household only) Test significance of rhoc
                    differences from zero and 1 (if positive) and -1 (if
                    negative).  Because there may be two p values, they are
                    shown in line(s) below the RhoC result and std. error.

         -testrhop  (Bivariate polygenic only) Test significance of derived
                    estimate of phenotypic correlation differences
                    (difference from 0).

         -rhopse     (-testrhop must be specified also) Get standard error
                     of rhop, saved in model file rhop.mod and variable
                     SOLAR_RhoP_SE

         -sporadic  Only evaluate sporadic models, not polygenic.

         -keephouse Keep "household effect" C2 parameter in final model
                    even if it maximizes to zero in the best polygenic
                    (or sporadic) model.

         -residinor After maximizing final sporadic model (after covariate
                    testing, if that is done), residualize the final model
                    and inormalize the residual trait.  (Warning!  The
                    phenotypes file loaded at the end of analysis will be
                    the residual phenotypes file.)
           
          -fphi     Option to run polygenic using fphi function.

Notes:    (1) Output is written to directory selected by 'outdir' command,
          or, if none is selected, to a directory named by the trait.  This
          is called the "maximization output directory."  Polygenic results
          are in file named polygenic.out.  Important loglikelihoods and
          statistical computations are recorded in polygenic.out.logs.  If
          the -sporadic option is selected, the files are sporadic.out and
          sporadic.out.logs.  For univariate models, the residuals are
          computed and written to a file named polygenic.residuals (or
          sporadic.residuals), then the statistics of those residuals
          are written to a file named polygenic.residuals.stats (or
          sporadic.residuals.stats).  If the residual kurtosis is
          above 0.8, you get a special warning (see note 5 below).  You
          also get a special warning if the trait standard deviation is
          below 0.5, which is undesireable for numerical reasons.

          (2) Prior to running polygenic, you should set up the trait and
          covariates.  You may use the trait and covariate commands, or
          use the "automodel" command. "automodel" selects all variables
          otherwise unaccounted for in the phenotypes file as candidate
          covariates, and also sex and the standard interactions with
          sex and age.  (If you are unfamiliar with "automodel" it would
          be a good idea to examine the covariates afterwards with the
          covariates command...)

          (3) Beginning with Version 8.3.1, the phi2.gz matrix is
          loaded during maximization if phi2 is not already loaded
          by the user.  That was already being done for many kinds
          of maximization, just not the simplest univariate
          quantitative.  The AutoPhi2 modeltype brings back the
          original on-the-fly calculation of phi2, if no phi2.gz had
          been loaded.  That is often not what is intended, but
          sometimes people who had not saved their phi2.gz were
          relying on it.

          (4) If household effect (see "house") is in effect when the
          polygenic command is given, it will be included in the analysis.
          If the household parameter C2 is 0 in the household polygenic
          model, it will be removed from the final model regardless of
          whether "covariate screening" is performed, unless -keephouse
          is specified.  The p value for C2 will be computed (if C2 is
          nonzero), but the p value will not cause C2 to be removed from
          the final model. The p value of the C2 parameters is not
          computed for bivariate models.

          (5) If any covariates have been constrained by the user,
          certain tests are not allowed: the determination of total
          variance due to covariates, or the Leibler-Kullback R
          squared (done for discrete traits).  Also, such covariates
          are not included in the "screening" if the screening option
          is selected.

          (6) If you get the message about Residual Kurtosis being too high
          because it is above 0.8, there is danger of LOD scores  being
          estimated too high in a subsequent linkage analysis.  You should
          start over using either tdist or lodadj or inormal (see 
          documentation) to protect against this.  If you are already
          using tdist or lodadj, you may ignore this warning, but it would
          be fair to report both the Residual Kurtosis and the method
          you are using to deal with it.  We most strongly recommend
          inormal, which in conjunction with the define command creates
          an inverse normalized transformation of your trait(s).

          If there are no covariates, the Kurtosis is computed from the
          trait itself, and no "residuals" are computed.  The same warning
          threshold applies.  We define Kurtosis as 0 for a standard
          normal distribution; 3 has already been subtracted from the
          normalized 4th central moment.

          (7) The polygenic command only supports our "standard"
          parameterizations.  If you would like to use the esd,gsd,qsd
          parameterization, use the polygsd command (see "help polygsd"
          for more information) instead.

          (8) For bivariate polygenic models only, a derived estimate of
          RhoP, the phenotypic correlation, is displayed on terminal
          and written to polygenic.out.  This estimate is computed from the
          h2r's, rhog, and rhoe according to the following formula:

              sqrt(h2r(ti))*sqrt(h2r(tj))*rhog + 
                  sqrt(1-h2r(ti))*sqrt(1-h2r(tj))*rhoe

          To determine the significance of RhoP by comparing models with
          a rhop parameter and a rhop parameter constrained to zero, use
          the -testrhop option.  Additional models rhop.mod and rhop0.mod
          are written to the output directory.

          (9) The polygenic command creates global variables which may
          be accessed later (which is often useful in scripts).  The
          variables are:

              SOLAR_Individuals  number of individuals included in sample
              SOLAR_H2r_P        p value for h2r
              SOLAR_Kurtosis     residual trait kurtosis
              SOLAR_Covlist_P    list of p values for covariates
              SOLAR_Covlist_Chi  list of chi values for covariates
              SOLAR_RhoP         derived estimate of phenotypic correlation
                                   for bivariate polygenic models, {} if
                                   not calculated
              SOLAR_RhoP_P       -testrhop sets this to p value of rhop
                                   being nonzero
              SOLAR_RhoP_SE      -rhopse sets this to se value of rhop
              SOLAR_RhoP_OK      -testrhop sets this if likelihood of rhop
                                   parameterized model matches polygenic,
                                   as it should

          The covariate lists are created only if the -screen option
          is used.  All screened variables are included, regardless of
          whether they were retained in the final model.  Before you
          can access any of these variables in a script, you must
          use a "global" command.  For example:

              global SOLAR_Kurtosis
              if {$SOLAR_Kurtosis > 4} {puts "Very bad kurtosis!"}

          (10) The default is for the standard error option to be turned
          on (and temporarily off, when desireable for certain tests).
          However, if you turn the standard error option off before
          starting polygenic, it will remain off.
          

Shortcuts: polyg - polygenic

Return to command index

A1.153 polygsd


Purpose:  Set up polygenic model esd and gsd parameters (EXPERIMENTAL)

Usage:    polygsd

Note:     "model new" and "trait" commands should be given first.
          After polygsd, you should use "maximize" command.

          Use the gsd2h2r command to convert resulting esd,gsd parameters
          to h2r value.

          Use the linkqsd command to add in linkage element afterwards.

Example:  model new
          trait q4
          covar age sex
          polygsd
          maximize
          linkqsd gaw10mibd/mibd.9.18.gz  ;# could maximize after this
          chromosome 9 10
          interval 5
          mibddir gaw10mibd
          multipoint -link linkqsd0 -cparm {esd gsd qsd}


Return to command index

A1.154 polymod


Purpose:  Set up polygenic model with the standard parameters

Usage:    polymod [-d]

IMPORTANT::  Phenotypes, trait, and covariate commands must be
             given beforehand.

          -d  Check for discrete trait(s) and make necessary changes.
              In most cases, this option is not necessary because
              "maximize" later checks for discrete traits and can also
              make these changes: constraining SD to 1 and making
              sure phi2 matrix is loaded, for each discrete trait.
              However, use of -d option can make the constraint or matrix
              order inside complex models easier to deal with.

Notes:    The starting lower bound for e2 is controlled by e2lower.

          Normally you do not use this command directly, but instead use 
          the "polygenic" command to do a complete polygenic analysis,
          which maximizes a polygenic model which was set up using this
          command.  See the tutorial in Chapter 3.

          polymod will modify an existing sporadic or linkage model
          to change it to polygenic.  Use spormod to set up a
          sporadic model, and linkmod to set up a linkage model.
          None of these commands maximize models, they just set up
          or modify the parameters and omega as required.
          
          This command removes a house parameter (if present) from
          the omega, since a "polygenic" model is distinct from a
          "household polygenic" model.  If you want the latter, call
          polymod first, then house.  Or call house, THEN polygenic,
          since the polygenic command will check for and handle household
          effect properly.

Shortcuts: polym - polymodel

Return to command index

A1.155 polyvoxel


Purpose: polygenic analysis of image data

Usage: trait ...
       covariate ...
       polyvoxel <maskname> <outname>

<maskname> the filename of mask image file to use
<outname> the filename of image file to write as output

Notes:  Trait and covariate should be selected first.  The number of
layers in the output image will be adjusted to match the number of
traits and covariates

Return to command index

A1.156 power


Purpose:  Perform power calculations (or MathMatrix power operation)

Usage:    power [-prev] [-h2t <h2t>] [-h2r <h2r>] [-data <fieldname>]
                [-grid {<from> <to> <incr>}] [-lod {<lod> ...}]
                [-freq <freq>] [-nreps <nreps>] [-seed <seed>]
                [-overwrite] [-plot]

          power -restart [-grid {<from> <to> <incr>}]
                [-lod {<lod> ...}] [-nreps <nreps>] [-plot]

          power <matrix> <integer-power>  ;# see help mathmatrix

          This command performs a power calculation for the currently
          loaded pedigree, with the following assumptions:

              (1) the trait to be studied is either quantitative or
                  dichotomous (e.g. affected/unaffected)

              (2) the trait to be studied is influenced by a single
                  bi-allelic QTL with, optionally, a residual additive
                  genetic effect due to polygenes

              (3) there will be fully informative marker genotype data
                  available for all study subjects

              (4) all study subjects will be phenotyped for the trait
                  to be studied (unless the -data option is used to
                  exclude those individuals who will not have phenotypic
                  data; see the description of this option below)

          Simulation is used to estimate the LOD score one would expect
          to obtain for a QTL having a certain effect size (i.e. QTL
          heritability).  The expected LOD is calculated for a range of
          effect sizes.  The ELODs, in turn, are used to compute the power
          to detect a QTL having these various effect sizes with, for
          example, a LOD of 3.

          The default is to perform 10 replicates of the simulation for
          each effect size in the range .01, .02, .03, ..., .99.  For
          each replicate, both a polygenic and a linkage model are fitted
          to the simulated data and then compared.  The resulting QTL
          heritability estimate and LOD score are recorded.  The observed
          LODs are converted to power, i.e. the power to detect the
          corresponding observed effect size with a specified LOD.

          The following options give the user some control over the power
          calculation procedure:

              -prev     If the trait to be studied is dichotomous, SOLAR
                        will assume the existence of an unobserved liability
                        distribution. Individuals with liabilities above
                        some threshold value will be "affected", i.e. they
                        will have the larger of the two trait values (for
                        example, a 1 for a 0/1 trait.) The -prev option
                        is used to specify the "disease" prevalence, or
                        fraction of individuals who are "affected", which
                        in turn determines the liability threshold.

              -grid     Specify the set of effect sizes for which ELODs
                        will be computed. The grid is given by a set of
                        three numbers enclosed in curly braces:

                            {<from> <to> <incr>}

                        where <from> is the starting effect size, <to>
                        is the last effect size considered, and <incr>
                        is the interval between grid points.  If the
                        desired grid consists of a single effect size,
                        the three-number list can be replaced by that
                        single number and curly braces are not required.
                        The default grid is from 0.05 through 0.5 by
                        steps of 0.05.

              -h2r      At each grid point, add a constant residual
                        additive genetic heritability <h2r> to the
                        QTL-specific heritability.

              -h2t      Set the residual heritability so that the total
                        heritability (QTL plus residual) is equal to a
                        constant value <h2t>.

              -data     Exclude individuals from the power calculation
                        who are missing data for phenotype <fieldname>.

              -lod      Specify the set of LODs for which power will be
                        computed.  If more than one LOD is specified, the
                        set of numbers must be enclosed in curly braces.
                        The default set of LODs is { 3 2 }.  The order of
                        the LODs is important since it is reflected in
                        the output file power.out (see below).  The set
                        of LODs can also be changed for a completed power
                        calculation by using the -lod option in conjunction
                        with the -restart option.

              -freq     Specify the frequency of the first of the two
                        alleles assumed to exist for the QTL.  The default
                        allele frequency is 0.2113; this frequency results
                        in the simulated trait having kurtosis = 0.

              -nreps    Perform <nreps> simulations at each grid point.
                        The default number of replicates is 100.

              -seed     Set the random number generator seed.  The default
                        is to set the seed based on the date and time.

              -plot     At the end of the power calculations, display a
                        plot of power versus QTL heritability.  To display
                        this plot for a previously completed calculation,
                        use the command "plot -power".

              -overwrite (or -ov)  Overwrite the results of a previous
                                   power calculation.

              -restart (or -r)     Restart a power calculation.


Notes:    It is possible to change the grid of effect sizes and the number
          of replicates when restarting a calculation.  The calculation
          will not be restarted if a grid is chosen that does not include
          all the points in the previously specified grid unless the
          -overwrite option is included, in which case the simulation
          replicates for any extra grid points are discarded.  Similarly,
          the -overwrite option is necessary if fewer replicates are
          requested than were done previously, in which case any extra
          replicates are discarded.  The set of LODs for which power
          estimates are computed can also be changed in a restart.  The
          other parameters, e.g. h2t, cannot be changed and are kept the
          same as in the original run, with the exception of the seed for
          the random number generator which is set based on the date and
          time.

          The plot of power versus QTL heritability is derived from a
          smoothed estimate of the ELODs.  Smoothing is achieved with a
          least-squares fit of a second degree polynomial to the ELODs as
          a function of QTL heritability.  It is important to have a
          sufficiently large number of replicates to produce a reasonable
          curve fit.  The default of 100 replicates should suffice in most
          cases.  To compute power as a function of the unsmoothed ELODs,
          include the -nosmooth option.

          The following files are created:

              power.out    A space-delimited file containing a line for
                           each grid point in the format X Y1 Y2 ..., which
                           is suitable for input to plotting packages such
                           as xmgr.  The first (or X) column contains the
                           QTL heritability.  The succeeding columns hold
                           the power estimates, each corresponding to a
                           different LOD.  These columns are in the order
                           given by the -lod option.

              power.info   Stores the various options selected along with
                           the ELODs, averaged over the replicates, at each
                           grid point.

              power.lods   Stores the results of the simulation replicates
                           run at each grid point.  This file, along with
                           power.info, is used to restart an interrupted
                           power calculation.

           During a power calculation, various files named "simqtl.*" are
           created along with a trait directory named "simqt". These will
           be removed at the end of the run.

Return to command index

A1.157 putsa


Purpose: Write to file and/or terminal without having to open and close

Usage: putsnew <filename>            ;# create new file or truncate old file
       putsa <filename> <string>     ;# append to file
       putsat <filename> <string>    ;# append to file and write to terminal

Notes:

1.  putsnew creates the file if not yet created, or truncates existing
    file to zero length.  This is generally needed before starting to
    append to file using putsa unless you know the file already exists.

2.  These procedures are NOT as efficient as Tcl's built-in operations
    for writing to a file (open, puts, close).  If efficiency is
    paramount, and you have a tight writing loop not subject to
    exceptions or with fully handled exceptions, it is preferable to
    use the Tcl operations.  However these procedures are believed to
    be as efficient as shell append piping (>>).  These procedures are
    considerably more efficient than the now obsolescent "putsout".

3.  putsa writes (appends) to the existing file

4.  putsat writes (appends) to the existing file, and to the terminal

5.  If an absolute filename is not specified, the file path is relative to
    the current working directory.  If you need to write to a file in the
    maximization output directory, use the procedure full_filename first
    to produce the required pathname.  Beware that the pathname produced
    by full_filename is also relative to the current working directory, so
    that if you change the current working directory, or the trait, or the
    outdir, you should run full_filename again, if you want to continue
    writing to a file in the current maximization directory (though it
    will then be a different directory and a different file).

 Example:

      # writing to file in maximization output directory
      set myfullname [full_filename myname.out]
      putsnew $myfullname
      putsat $myfullname "Writing to $myfullname"
      putsa $myfullname "writing more, but not to terminal this time"
      trait newtrait
      set myfullname [full_filename myname.out]
      putsnew $myfullname
      putsat $myfullname "Now writing to $myfullname"

Return to command index

A1.158 putsat


solar::putsa --

Purpose: Write to file and/or terminal without having to open and close

Usage: putsnew <filename>            ;# create new file or truncate old file
       putsa <filename> <string>     ;# append to file
       putsat <filename> <string>    ;# append to file and write to terminal

Notes:

1.  putsnew creates the file if not yet created, or truncates existing
    file to zero length.  This is generally needed before starting to
    append to file using putsa unless you know the file already exists.

2.  These procedures are NOT as efficient as Tcl's built-in operations
    for writing to a file (open, puts, close).  If efficiency is
    paramount, and you have a tight writing loop not subject to
    exceptions or with fully handled exceptions, it is preferable to
    use the Tcl operations.  However these procedures are believed to
    be as efficient as shell append piping (>>).  These procedures are
    considerably more efficient than the now obsolescent "putsout".

3.  putsa writes (appends) to the existing file

4.  putsat writes (appends) to the existing file, and to the terminal

5.  If an absolute filename is not specified, the file path is relative to
    the current working directory.  If you need to write to a file in the
    maximization output directory, use the procedure full_filename first
    to produce the required pathname.  Beware that the pathname produced
    by full_filename is also relative to the current working directory, so
    that if you change the current working directory, or the trait, or the
    outdir, you should run full_filename again, if you want to continue
    writing to a file in the current maximization directory (though it
    will then be a different directory and a different file).

 Example:

      # writing to file in maximization output directory
      set myfullname [full_filename myname.out]
      putsnew $myfullname
      putsat $myfullname "Writing to $myfullname"
      putsa $myfullname "writing more, but not to terminal this time"
      trait newtrait
      set myfullname [full_filename myname.out]
      putsnew $myfullname
      putsat $myfullname "Now writing to $myfullname"

Return to command index

A1.159 putsnew


solar::putsat --
solar::putsa --

Purpose: Write to file and/or terminal without having to open and close

Usage: putsnew <filename>            ;# create new file or truncate old file
       putsa <filename> <string>     ;# append to file
       putsat <filename> <string>    ;# append to file and write to terminal

Notes:

1.  putsnew creates the file if not yet created, or truncates existing
    file to zero length.  This is generally needed before starting to
    append to file using putsa unless you know the file already exists.

2.  These procedures are NOT as efficient as Tcl's built-in operations
    for writing to a file (open, puts, close).  If efficiency is
    paramount, and you have a tight writing loop not subject to
    exceptions or with fully handled exceptions, it is preferable to
    use the Tcl operations.  However these procedures are believed to
    be as efficient as shell append piping (>>).  These procedures are
    considerably more efficient than the now obsolescent "putsout".

3.  putsa writes (appends) to the existing file

4.  putsat writes (appends) to the existing file, and to the terminal

5.  If an absolute filename is not specified, the file path is relative to
    the current working directory.  If you need to write to a file in the
    maximization output directory, use the procedure full_filename first
    to produce the required pathname.  Beware that the pathname produced
    by full_filename is also relative to the current working directory, so
    that if you change the current working directory, or the trait, or the
    outdir, you should run full_filename again, if you want to continue
    writing to a file in the current maximization directory (though it
    will then be a different directory and a different file).

 Example:

      # writing to file in maximization output directory
      set myfullname [full_filename myname.out]
      putsnew $myfullname
      putsat $myfullname "Writing to $myfullname"
      putsa $myfullname "writing more, but not to terminal this time"
      trait newtrait
      set myfullname [full_filename myname.out]
      putsnew $myfullname
      putsat $myfullname "Now writing to $myfullname"

Return to command index

A1.160 putsout


Purpose:  Write message to terminal and/or file [obsolescent]

NOTE: This is inefficient and no longer recommended.  New code should
      use putsa to append to a file and putsat to append to file
      and write to terminal.  See help putso.

Usage:    putsout [-q] [-d.] [-nonewline] <filename> <message>

          -q            No output to terminal
          -d.           Write file to current directory
          -nonewline    As with puts command (note: may delay output)
          <filename>    *name* of file in current output directory (outdir)
          <message>     string

Simple Example:  putsout mine.out "The result was $result"


Advanced Example: (Beginners ignore!)

           set q ""
           ifverbmax set q "-q"
           eval putsout $q \"Iteration: $i   Value: $value\"

Note: If using a variable for -q which could be "", be sure to use
      eval, otherwise "" would be considered filename argument, then
      remember to \ the quotes or they disappear during the eval.

Return to command index

A1.161 qtld


Purpose:  Association analysis for snps

Usage:    qtld

Notes:    Current model is used as starting point.  It is saved in output
          directory as qtld.start.mod with standard errors turned off.

          Snp association phenotypes are prefixed by b_ w_ b2_ and w2_ and
          are taken from the currently loaded phenotypes files.  If there
          is one matching phenotype, the other 3 are expected, and it is an
          error if any are missing.

          Main output is written to terminal and file qtld.out in the
          output directory.  An additional file with detailed measured
          genotype information is written to mgeno.out in output directory.
          These output files are fixed column size and space delimited.

          The output fields in the mgeno.out file (same as in the terminal
          output) are:

              Trait, SNP, Stratification, Measured Genotype, QTDT, QTLD

          The output fields in the mgeno.out file (also listed at the top
          of the file) are:

               Trait, SNP, p(mg), h2m, muAA, se(muAA), muAB, se(muAB),
                   muBB, se(muBB)

Return to command index

A1.162 qtnm


Purpose:  Marginal tests for bayesavg -qtn

Usage:    [allsnp]
          bayesavg -qtn -max 1
          [load map snp.map]
          qtnm [-noplot] [-nomap]

          -noplot  Do not plot results
          -nomap   Do not use map file; SNP locations are encoded in names

Notes:    You must do bayesavg -qtn [-max 1] first, then qtnm.  qtnm
          writes a datafile qtnm.out in the outdir, then invokes
          plotqtn to plot it.  (The -max 1 is optional, however,
          if you want to do this quickly, you had best include it.)

          To include all snps as covariates in the starting model, use
          the separate command "allsnp".

          SNP covariate names (after the snp_ or hap_ prefix) will be
          mapped to locations using the currently loaded map file,
          which must be loaded prior to running qtnm.  Map files stay
          loaded from one solar session to the next (in the same working
          directory) so once you have loaded it, you do not need to load
          it again.

          Beginning with version 3.0.3, snp names will always be mapped
          to locations using a loaded map file.  However, you can revert
          to the previous method, in which the locations are encoded into
          the snp "names" using the -nomap option.

          Beginning with SOLAR version 3.0.2, the qtnm.out file
          has the following 5 columns:

          SNP Name (or location if numeric)
          SNP location
          Chi Squared
          p
          log(p)

         Previously there was no "SNP Name" column because it was
         assumed to be the location.  Note that plotqtn accepts
         qtnm.out files with either 4 or 5 columns.

Shortcuts: qtn - qtnmarker

Return to command index

A1.163 quadratic


Purpose:  Get the most recent quadratic form after a maximization

Usage:    quadratic

Note:     This could be used in a TCL script like this:

          set firstq [quadratic]



Shortcuts: quadrat - quadratic

Return to command index

A1.164 read_arglist


Purpose:  Read hyphenated optional arguments and argument-value pairs

Usage:    read_arglist arglist [identifier_name value_var]+
          value_var := varname | {statement block}

Notes:
         This provides a general way of handling argument lists of the
         form:

         [[arg] | [-identifier [value]]]+

         Which is to say that there may be any number of "regular"
         arguments and "hyphenated" arguments.  The "hyphenated"
         arguments may be stand-alone or paired with values.  (Unlike
         typical Unix syntax, stand-alone hyphenated arguments MAY NOT be
         strung together, and hyphenated arguments with values must be
         separated by space and not with some other character such as =).

         The "regular" arguments (those not hyphenated or part of
         a hyphenated pair) are put into a list which is returned by
         this procedure.

         Hyphenated arguments may either require following "value"
         arguments or not allow them (in which case the hyphenated
         argument acts like a switch).  Value arguments must be separated
         from the hyphenated argument by a space (as is typical in Tcl).
         For example

             bar -height 1.5

         There are two ways read_arglist can handle a hyphenated argument.

         (1) The first, specified by the 'varname' expansion of value_var,
         performs an assignment of the "value" argument to the caller's
         varname variable.  For example:

             read_arglist $args -height h

         If $args contains "-height 1.5", then 1.5 will be assigned to the
         caller's 'h' variable.  Note that this method always requires
         a value argument and so does not work for switch arguments.

         (2) The second, specified by the '{statement block}' expansion
         of value_var executes an arbitrary set of expressions in
         the caller's context.  This allows a simple switch or more
         complex list-building.  The the statement block contains the
         token VALUE, a value argument is required and the token
         VALUE is replaced by the actual value argument.  Substitution
         is performed only once and only for the first occurance of
         VALUE.

         A simple switch is implemented like this:

             read_arglist $args -bottom {set bottom 1}

         If $args contains "-bottom," bottom (in the caller's context) is
         set to 1.  A value argument is neither required nor allowed.

         A list-building argument is implemented like this:

             read_arglist $args -include {lappend ilist VALUE}

         If $args contains "-include foo" then "lappend ilist foo" is
         executed in the caller's context.

         NOTE that in the {statement block} form, the statement block
         IS REQUIRED to have more than one list element.  A llength is
         done to determine which form is being used.  Thus, you cannot
         have:

             read_arglist $args -exit {exit}  ;# *** BAD ***

         but you could have

             read_arglist $args -exit {eval exit}

         If -* is used as an identifier_value, it matches any argument
         in the argument list and causes that argument do be added to
         the return list.  Normally -* should be the last identifier
         value; all following identifier values will be ignored.
         Also, the value_var (or statement block) following -* is never
         assigned or executed and so can be anything.  This is intended
         as an escape to permit only certain arguments to be processed
         leaving other variables for processing by a later procedure.

   More notes:

         It is the responsibility of the caller to assign default
         values before calling this procedure.

         Hyphenated arguments may not have hyphenated strings for values.
         However, hyphenated arguments may have negative numbers (e.g.
         -1.2e5) for values.  If the value does not parse as an integer
         or floating point number, it must not begin with hyphen.  If
         the token following a hyphenated argument begins with - but is
         not a number, it is considered to be another hyphenated argument
         (which may cause the preceding argument to be flagged as having
         a missing value).

         Hyphenated argument names must not be numbers (integer or floating
         point).  For example, you may not have "-1" or "-1.2e5" as a
         hyphenated argument.
         
         Hyphenated arguments which do not match any of the templates
         given raise the "Invalid argument %s".

         The identifier matching rule is case insensitive.

Return to command index

A1.165 read_model


Purpose:  Read a parameter value or likelihood from any saved model

Usage:    read_model <model-name> loglike             ; returns loglikelihood
          read_model <model-name> <parameter>         ; returns mle value
          read_model <model-name> <parameter> -se     ; standard error
          read_model <model-name> <parameter> -lower  ; lower bound
          read_model <model-name> <parameter> -upper  ; upper bound
          read_model <model-name> <parameter> -score  ; score

Model is read from current maximization output directory (see
help outdir).

Example:

           trait q4
           read_model null0 h2r

Return to command index

A1.166 read_output


Purpose:  Read variable statistics from maximization output file

Usage:    read_output <outfile> <varname> [-mean | -min | -max | -std]

          -mean Get variable mean (default)
          -min  Get variable minimum
          -max  Get variable maximum
          -std  Get variable standard deviation
          -d    1 if discrete, 0 otherwise

Note:     If outfile is not full pathname, current trait/outdir is assumed.
          Statistics pertain to actual sample used in maximization.

Example:  read_output null1.out q4 -std

Return to command index

A1.167 register


Purpose:  Create registration key file

Usage:    register <key>

Notes:    This creates a file ~/.solar_reg containing the key.  Do
          not delete this file.  You may copy this file to your
          home directory on other systems to run SOLAR if the
          same username is used.  (Each key matches only one
          username.)

          To obtain a key, please send a request to solar@txbiomedgenetics.org.
          specifing the username(s) under which you will be using the
          program, the email addresses of the users, and the NIH grant
          numbers (if any) that apply to the work for which SOLAR may
          be used.

Shortcuts: regi - register

Return to command index

A1.168 relatives


Purpose:  Show relationships of relative pairs included in analysis
            (having all required variables)

Usage:    relatives [-meanf]
                     -meanf causes Mean f to be reported
          relpairs             ;# alias for "relatives -meanf"


Notes:    output is returned (displayed) and also written to file named
          relatives.out in current trait/outdir.

          Uses previously stored null0 model in current trait/outdir.
          Polygenic command should have been run previously to create
          null0 model.

Shortcuts: relat - relatives

Return to command index

A1.169 relpairs


solar::relatives --

Purpose:  Show relationships of relative pairs included in analysis
            (having all required variables)

Usage:    relatives [-meanf]
                     -meanf causes Mean f to be reported
          relpairs             ;# alias for "relatives -meanf"


Notes:    output is returned (displayed) and also written to file named
          relatives.out in current trait/outdir.

          Uses previously stored null0 model in current trait/outdir.
          Polygenic command should have been run previously to create
          null0 model.

Return to command index

A1.170 remlist


Purpose:  Remove element from list by name

Usage:    remlist <list> <element>

Notes:    Input list is not modified, but new list is returned.

          Only first matching element is removed.  This works well
          with setappend for set-like behavior: use setappend to add
          elements to "set" and remlist to remove
          elements from set.

          Match testing is case insensitive.

          No error is raised if thre is no matching element; input
          list is returned unchanged.

See Also: setappend

Return to command index

A1.171 remove_global


Purpose:  Remove a global variable (so it no longer exists)

Usage:    remove_global <variable_name>

Notes:    It is not necessary to declare variable as global first,
          and there is no error if no such global actually exists.

See Also: if_global_exists

Return to command index

A1.172 residual


Purpose:  Compute residuals for a maximized model and phenotypes file

Usage:    residual [solarfile] [-out <residualfile>] [-phen <phenfile>] 
                   [-fewest] [-needped]

          solarfile       solar maximization output file which is expected
                            in the current outdir.  The default is null0.out
                            which is created by the polygenic command.
                            The default directory is the current outdir,
                            but you may specify a relative pathname to
                            the current directory.

                          EVD2 models must have actual model currently
                          in memory (such as is the case immediately after
                          running polygenic).

                           If the "define" command is used to define the
                           names used for trait(s) or covariate(s), there
                            must be a model with the same rootname in
                            the same directory as the output file.  The
                            default is null0.mod.

                           Handling of the "scale" (and "noscale") commands
                            also requires the presence of the model with
                            the same rootname in the same directory as the
                            output file.  If this model is not present,
                            residual will finish but print a warning if
                            not inside a script.

          residualfile    new phenotypes file with trait 'residual' (the
                            default is 'residual.out' written to the working
                            directory).
          phenfile        the base phenotypes file; the default is to use
                            the currently loaded phenotypes file(s).
          -fewest         Copy fewest fields to residualfile (this would be
                            ID, FAMID (if required), trait, and residual.
                            The default is to include the above along with
                            all other (unused) variables from the phenfile.
          -needped        Only include individuals in the pedigree file.
                            (By default, all individuals in the phenotypes
                             file would be included, unless there is a
                             covariate including sex which require the
                             pedigree file.)

Example:
          solar> automodel pheno atrait
          solar> polygenic -s
          solar> residual

MOST IMPORTANT!

          This procedure requires a maximization output file.
          Unless otherwise specified, the default is assumed to be null0.out
          which is produced by the "polygenic" command.  If this is not
          what you want, you need to specify the maximization output file.
          You cannot specify a model file, that is insufficient.

Additional Notes:

          Univariate quantitative trait only.
          The trait or outdir must be specified first.
          Must be run in the directory in which pedigree was loaded.
          FAMID is included if present in both phenotypes and pedigree files.
          residualfile will be written in comma delimited format.
          This procedure does not handle hand-written 'mu' equations, only
            standard covariates.
          Not applicable to discrete traits.


Shortcuts: resi - residuals

Return to command index

A1.173 retext


solar::retextpc --

Purpose: Translate files with Old Mac line terminators to unix terminators

Usage:  retext <input-file> [<output-file>]    ;# for Old Mac files
        retextpc <input-file> [<output-file>]  ;# for PC files (if needed)

Notes:  retextpc is for converting Dos/Windows files to the unix
        format SOLAR likes best (however this is not usually needed for
        csv files from PC's which generally SOLAR can handle asis).

        SOLAR now identifies the old style Mac termination when opening
        phenotypes files.  These files MUST be translated because they
        are incompatible with the C programming library.  If you need to
        run the retext command, you will get this message:

            File old.txt has unsupported text line terminators
            Use retext command to fix file before using

        The full input filename must be given, including ".txt" extension
        if present.  If no output filename is specified, an additional
        ".tr" extension is added to the input filename.  If a file with
        the .tr extension already exists, it is overwritten.


Return to command index

A1.174 retextpc


Purpose: Translate files with Old Mac line terminators to unix terminators

Usage:  retext <input-file> [<output-file>]    ;# for Old Mac files
        retextpc <input-file> [<output-file>]  ;# for PC files (if needed)

Notes:  retextpc is for converting Dos/Windows files to the unix
        format SOLAR likes best (however this is not usually needed for
        csv files from PC's which generally SOLAR can handle asis).

        SOLAR now identifies the old style Mac termination when opening
        phenotypes files.  These files MUST be translated because they
        are incompatible with the C programming library.  If you need to
        run the retext command, you will get this message:

            File old.txt has unsupported text line terminators
            Use retext command to fix file before using

        The full input filename must be given, including ".txt" extension
        if present.  If no output filename is specified, an additional
        ".tr" extension is added to the input filename.  If a file with
        the .tr extension already exists, it is overwritten.


Return to command index

A1.175 round_epedigree


Purpose: Converts empirical pedigree values to rounded phi2 value that would
  be produced but a coded pedigree
         
Usage: round_pedigree -i <input empirical pedigree name> -o <modified empirical pedigree name>
                     -relation <max relationship number>

Example: round_epedigree  -i epedigree.csv -o modified_epedigree.csv -relation 5
        The value written to output are determined by finding the closest value in 
        the set of relationship values and 0.The max relationship value being 0.5^<max relation>. 

Return to command index

A1.176 rvi


Purpose: rvi command will calculate regional vulnerability index for a 
group of patients and controls that are coded by "DX" field. The details
of the calculation is provided in Kochunov et. 2019, PMID: 31352812.
         
Usage: rvi --modality <name of modality> --disorder <name of disorder> --out
<output file name>

In brief, data from controls (DX=0) is used to perform inverse normal, followed 
by Z-normalization after correcting for covariates setup by the ordinary covar 
command. Typically age and sex are used.

The phenotype file must include the standard names used by ENIGMA workflows. 
These names of expected phenotypes are provided in the cortical-effects.csv, 
subcortical-effects.csv and dtifa-effects.csv file in solar main directory.
The RVI can be calculated for three modalities (1st option) named as:cortical,
subcortical, and dtifa
Disorder names include: ssd, mdd, bp, ad 

Return to command index

A1.177 save


Purpose:  save <object-type> <arguments>

          save model <filename>

          More information is available under "help model"


Shortcuts: sav - save

Return to command index

A1.178 scale


Purpose:  scale a covariate variable, or disable default scaling

Usage:    scale <var> <center>  ; scale this variable to this center value
          noscale <var>         ; use 0 as center value disabling default
          scale                 ; show all non-default scaling in effect
          scale <var>           ; show scaling for this variable
          scale default <var>   ; return to default scaling for var
          scale default_all     ; return to default for all vars

          <var>     any covariate variable, might be used in interaction too
          <center>  real number

Notes:

    By default, SOLAR adjusts all covariate variables to the sample mean.
    Using the scale command, you can adjust any covariate variable to
    any other fixed value, or disable adjustment altogether (by adjusting
    to zero).

    The adjustment applies to the variable whether it appears in a simple
    covariate (such as "age") or an interaction covariate (such as
    "age*sex") or both.

    There is currently no way of scaling the trait variable, or scaling
    any variable by a factor.  Those features could be added in a future
    release.

    Scaling is saved with the model, and is superceded by whatever scaling
    is in effect with a new model.

Return to command index

A1.179 screencov


Purpose:  Perform polygenic analysis with covariate screening
            Same as 'polygenic -screen'

solar::sporadic --
solar::polygenic --

Purpose:  Perform polygenic, sporadic, and/or household analysis
            Calculate H2r, significance of H2r, and proportion of variance
              contributed by covariates.
            Optionally performs covariate screening (determine significance
              level of each covariate).

Usage:   polygenic [-screen] [-all] [-p | -prob <p>] [-fix <covar>]
                   [-testcovar <covar>] [-testrhoe] [-testrhog] [-testrhoc]
                   [-sporadic] [-keephouse] [-testrhop] [-rhopse] [-fphi]

         (screencov is an alias for 'polygenic -screen')
         (sporadic is an alias for 'polygenic -sporadic')

         Typically before giving this command, you will give trait,
         covariate, and house (if applicable) commands.  You will also load
         pedigree and phenotypes files if they have not already been loaded.

             solar> load pedigree ped
             solar> load phenotypes phen
             solar> trait hbp
             solar> covariate age sex age*sex smoke
             solar> polygenic -screen

         Alternatively, you may use the "automodel" command first to
         include all available phenotypes as covariates.  See note 2
         below and "help automodel".

         -screen   (or -s)  Perform covariate screening:
                   Calculate significance level for each covariate, and run
                   only the significant covariates in the final analysis.
                   An inclusive significance threshold of 0.1 is used,
                   but may be changed with the -prob option.  Covariates
                   may be locked in regardless of significance with the
                   -fix or -all options.

         (An alternative method of covariate analysis using bayesian
          model averaging is available with the command:
              bayesavg -covariates)

         -p        (or -prob)  p is the probability level for keeping
                   covariates as "significant."  The default is 0.1.
                   It is set to be generous so that covariates are not
                   removed unnecessarily.  (The probability levels for
                   H2r and C2 are fixed at 0.05, however, H2r is never
                   removed from the final model even if it judged to
                   be not significant, and C2 is only removed from the
                   model if it is zero in the final model and therefore
                   has no effect at all.)

         -fix      (or -f) "fix" (lock in) this particular covariate
                   regardless of significance level.  NOTE: a -fix or -f
                   qualifier is required for each covariate to be fixed,
                   for example:  -f age -f sex

         -all      (or -a) Keep all covariates in final anaysis regardless
                   of significance level.

         -testcovar <covar>  Test the probability of this covariate only.
                    All other covariates are fixed and unscreened.  This
                    argument is incompatible with -screen (screen all
                    covariates).  The tested covariate is not removed from
                    final model regardless of probability.  For -testcovar,
                    the default probability level for declared
                    "significance" is 0.05 (which can be changed with -p
                    option).  Also, the reported proportion of variance
                    is for the tested covariate only.

         -testrhoe  (Bivariate only)  Test significance of rhoe difference
                    from 0 by running model where rhoe is constrained to 0.
                    The p value is shown in the same line as the RhoE value.

         -testrhog  (Bivariate only)  Test significance of rhog differences
                    from zero and from 1 (if positive) or -1 (if negative).
                    Because there may be two p values, they are shown
                    in line(s) below the RhoG result and standard error.

         -testrhoc  (Bivariate Household only) Test significance of rhoc
                    differences from zero and 1 (if positive) and -1 (if
                    negative).  Because there may be two p values, they are
                    shown in line(s) below the RhoC result and std. error.

         -testrhop  (Bivariate polygenic only) Test significance of derived
                    estimate of phenotypic correlation differences
                    (difference from 0).

         -rhopse     (-testrhop must be specified also) Get standard error
                     of rhop, saved in model file rhop.mod and variable
                     SOLAR_RhoP_SE

         -sporadic  Only evaluate sporadic models, not polygenic.

         -keephouse Keep "household effect" C2 parameter in final model
                    even if it maximizes to zero in the best polygenic
                    (or sporadic) model.

         -residinor After maximizing final sporadic model (after covariate
                    testing, if that is done), residualize the final model
                    and inormalize the residual trait.  (Warning!  The
                    phenotypes file loaded at the end of analysis will be
                    the residual phenotypes file.)
           
          -fphi     Option to run polygenic using fphi function.

Notes:    (1) Output is written to directory selected by 'outdir' command,
          or, if none is selected, to a directory named by the trait.  This
          is called the "maximization output directory."  Polygenic results
          are in file named polygenic.out.  Important loglikelihoods and
          statistical computations are recorded in polygenic.out.logs.  If
          the -sporadic option is selected, the files are sporadic.out and
          sporadic.out.logs.  For univariate models, the residuals are
          computed and written to a file named polygenic.residuals (or
          sporadic.residuals), then the statistics of those residuals
          are written to a file named polygenic.residuals.stats (or
          sporadic.residuals.stats).  If the residual kurtosis is
          above 0.8, you get a special warning (see note 5 below).  You
          also get a special warning if the trait standard deviation is
          below 0.5, which is undesireable for numerical reasons.

          (2) Prior to running polygenic, you should set up the trait and
          covariates.  You may use the trait and covariate commands, or
          use the "automodel" command. "automodel" selects all variables
          otherwise unaccounted for in the phenotypes file as candidate
          covariates, and also sex and the standard interactions with
          sex and age.  (If you are unfamiliar with "automodel" it would
          be a good idea to examine the covariates afterwards with the
          covariates command...)

          (3) Beginning with Version 8.3.1, the phi2.gz matrix is
          loaded during maximization if phi2 is not already loaded
          by the user.  That was already being done for many kinds
          of maximization, just not the simplest univariate
          quantitative.  The AutoPhi2 modeltype brings back the
          original on-the-fly calculation of phi2, if no phi2.gz had
          been loaded.  That is often not what is intended, but
          sometimes people who had not saved their phi2.gz were
          relying on it.

          (4) If household effect (see "house") is in effect when the
          polygenic command is given, it will be included in the analysis.
          If the household parameter C2 is 0 in the household polygenic
          model, it will be removed from the final model regardless of
          whether "covariate screening" is performed, unless -keephouse
          is specified.  The p value for C2 will be computed (if C2 is
          nonzero), but the p value will not cause C2 to be removed from
          the final model. The p value of the C2 parameters is not
          computed for bivariate models.

          (5) If any covariates have been constrained by the user,
          certain tests are not allowed: the determination of total
          variance due to covariates, or the Leibler-Kullback R
          squared (done for discrete traits).  Also, such covariates
          are not included in the "screening" if the screening option
          is selected.

          (6) If you get the message about Residual Kurtosis being too high
          because it is above 0.8, there is danger of LOD scores  being
          estimated too high in a subsequent linkage analysis.  You should
          start over using either tdist or lodadj or inormal (see 
          documentation) to protect against this.  If you are already
          using tdist or lodadj, you may ignore this warning, but it would
          be fair to report both the Residual Kurtosis and the method
          you are using to deal with it.  We most strongly recommend
          inormal, which in conjunction with the define command creates
          an inverse normalized transformation of your trait(s).

          If there are no covariates, the Kurtosis is computed from the
          trait itself, and no "residuals" are computed.  The same warning
          threshold applies.  We define Kurtosis as 0 for a standard
          normal distribution; 3 has already been subtracted from the
          normalized 4th central moment.

          (7) The polygenic command only supports our "standard"
          parameterizations.  If you would like to use the esd,gsd,qsd
          parameterization, use the polygsd command (see "help polygsd"
          for more information) instead.

          (8) For bivariate polygenic models only, a derived estimate of
          RhoP, the phenotypic correlation, is displayed on terminal
          and written to polygenic.out.  This estimate is computed from the
          h2r's, rhog, and rhoe according to the following formula:

              sqrt(h2r(ti))*sqrt(h2r(tj))*rhog + 
                  sqrt(1-h2r(ti))*sqrt(1-h2r(tj))*rhoe

          To determine the significance of RhoP by comparing models with
          a rhop parameter and a rhop parameter constrained to zero, use
          the -testrhop option.  Additional models rhop.mod and rhop0.mod
          are written to the output directory.

          (9) The polygenic command creates global variables which may
          be accessed later (which is often useful in scripts).  The
          variables are:

              SOLAR_Individuals  number of individuals included in sample
              SOLAR_H2r_P        p value for h2r
              SOLAR_Kurtosis     residual trait kurtosis
              SOLAR_Covlist_P    list of p values for covariates
              SOLAR_Covlist_Chi  list of chi values for covariates
              SOLAR_RhoP         derived estimate of phenotypic correlation
                                   for bivariate polygenic models, {} if
                                   not calculated
              SOLAR_RhoP_P       -testrhop sets this to p value of rhop
                                   being nonzero
              SOLAR_RhoP_SE      -rhopse sets this to se value of rhop
              SOLAR_RhoP_OK      -testrhop sets this if likelihood of rhop
                                   parameterized model matches polygenic,
                                   as it should

          The covariate lists are created only if the -screen option
          is used.  All screened variables are included, regardless of
          whether they were retained in the final model.  Before you
          can access any of these variables in a script, you must
          use a "global" command.  For example:

              global SOLAR_Kurtosis
              if {$SOLAR_Kurtosis > 4} {puts "Very bad kurtosis!"}

          (10) The default is for the standard error option to be turned
          on (and temporarily off, when desireable for certain tests).
          However, if you turn the standard error option off before
          starting polygenic, it will remain off.
          

Shortcuts: scree - screencov

Return to command index

A1.180 selectfields


Purpose:  Select fields (columns) from data file(s) and copy to a new file

Usage:    selectfields [-noid] [<infile>]* [.] [-np] [<field-name>]* 
                       [-o <outfile>]  [-sample] [-list filename] [-noid]
                                       
          A optional period (aka dot) ends the list of filenames and starts
          the list of field names.  If there is no dot, the first argument
          is assumed to be the one and only data filename.  The currently
          loaded phenotypes files are automatically included at the end of
          the list of files.  If nothing precedes the dot, only the
          phenotypes files are used.  Fields found in multiple files default
          to the first file in which they are found, however a warning is
          given when this happens.  The -np argument forces the loaded
          phenotypes files to be ignored.  The -sample argument forces
          only the inclusion of individuals having all field values
          defined.  Otherwise, a record is written for every ID encountered
          in the file(s) from which data is read, however one or more
          data value(s) might be blank.  

          -list filename   Use all the field names in this file, listed
                           one per line.  These are appended to the list
                           of field names given in the command line, if
                           any.

          If the -noid switch is given, the old version of selectfiles
          is used.  This takes one and only one <infile> followed by a
          list of fieldnames, with no dot in between.  The only other
          option allowed is -o.  No ID field is required in the input
          file, and no ID field is written unless included in the list
          of fieldnames.  The loaded phenotypes file is not used unless
          that is the one file named.

          If not specified, <outfile> defaults to selectfields.out

          <field-names> follow rules for phenotypes files and are also
            affected by field command specifications.  For example,
            if you specify "ID" as field name, this would also match a
            field name "EGO" in the file.

          Input file may be either PEDSYS or Comma Delimited format.
          Output file is comma delimited.

Example:  selectfields phen.dat out.dat ID AGE -o age.dat

Return to command index

A1.181 selectrecords


Purpose:  Select records from a file and copy them to a new file

Usage:    selectrecords <infile> [<outfile>] [{<conditions>}]*

          If not specified, <outfile> defaults to selectrecords.out

          Each <condition> is a Tcl conditional expression which includes
          field names in the file preceded by dollar sign $.
          Field names are case insensitive (you need not match
          capitalization used in file itself).  Each condition
          must be enclosed in curly braces and spaced from other conditions
          if any.  

          Conditions may also include actual Tcl variables, preceded by $$
          Tcl variables are Case Sensitive.

          Simple examples are shown, but any valid Tcl expression operators
          and functions may be used, and expressions may be arbitrarily
          complex...they are evaluated by the Tcl expression parser, with
          the exception of special pre-substitution of $$ variables.
          Internally, $$ variables are upvar'd into a local variables having
          leading capital S.

          If a condition includes a non-existant field, it will never be
          satisfied, producing an empty result file.  (In future, error
          detection may be added.)  If a condition includes a undefined $$
          tcl variable, an error will result.

          Input file may be either PEDSYS or Comma Delimited format.
          Output file is comma delimited.

          If the first condition does not include any dollar signs,
          it must include spaces (for example, {1 == 1}).  No such
          requirement is made for subsequent conditions.  It seems pointless
          to have condition without dollar signs anyway; if no condition
          is given you get all records (the "null condition" is always true).
              
Example:  selectrecords phen.dat out.dat {$bmi > 0.3} {$famid == 10}

          for {set F 1} {$F < 100} {incr F} {
              selectrecords phen.dat out$F.dat {$bmi > 0.3} {$famid == $$F}
          }

Note:     Records are "selected" when they match ALL conditions given (unless
          condition includes a non-existing field or has other error).

Return to command index

A1.182 setappend


Purpose:  Append only new elements to a list (keeping it like a set)

Usage:    setappend <listname> element

Note:     The list is identified by name, and may be modified, as with
          lappend.

Example:  set friends "John Tom Laura Jeff"
          setappend friends Harald

See Also: remlist

Return to command index

A1.183 setxor


Purpose:  Perform exclusive-or (xor) on two sets (Tcl lists)

Usage:    setxor aset bset

Note:     If element appears multiple times in one list, but not in other,
          it will appear multiple times in output.

Return to command index

A1.184 shortcut


Purpose:  Show command shortcuts legal in scripts

Usage:    shortcut <command>


Shortcuts: shortc - shortcuts

Return to command index

A1.185 showproc


Purpose:  Show SOLAR procedure or write to a file

Usage:    showproc <procname> [<filename>]

If <filename> is not specified, procedure is displayed on terminal using
the 'more' pager.  If <filename> is specified, renamed proc is written
to that file.

This procedure will show any SOLAR procedure (script), whether built-in
or user-defined.  Some, but not all, built-in SOLAR commands
are implemented as scripts, and can be shown by this command.  Other
SOLAR commands are implemented in C++ and FORTRAN, and cannot be shown
by this command.

User-defined scripts must be used once before they can be shown.

The formatting shown by showproc may not be as pretty as it actually is
in the source file because it will concatenate lines which are extended
using backslash.  showproc is based on the Tcl command "info body" which
has this "feature."

To protect built-in procedures from being accidentally superceded
through the use of this command, the procedure name is suffixed with
".copy".  If you choose to edit the script, IT IS RECOMMENDED THAT
YOU DO NOT RENAME IT TO HAVE THE SAME NAME AS THE ORIGINAL PROCEDURE
UNLESS YOU REALLY KNOW WHAT YOU ARE DOING.  If you do that anyway,
it would probably be ignored.  SOLAR prevents you from overriding
built-in procedures by putting the directory containing the active
solar.tcl file to the front of the auto-load list.  Normally, that
directory is the SOLAR_BIN directory defined when SOLAR starts up.
Even if you did have a copy of the solar.tcl file in your working
directory when SOLAR started up, procedures might be resolved either
to the solar.tcl file or to separate script files in your working
directory, depending on which appears earlier in an alphabetical
list.

Before new procedures can be used in SOLAR you must restart SOLAR or give
the newtcl command.


Return to command index

A1.186 siminf


Purpose:  Simulate a fully-informative marker and compute its IBDs

Usage:    siminf -out <markerfile> -ibd <ibdfile>

              -out  Write simulated marker genotypes to this filename.
                    The default is 'siminf.out' in the current working
                    directory.

              -ibd  Write marker-specific IBDs for the simulated marker
                    to this filename.  The default is 'ibd.siminf' in the
                    current working directory.  The file will be gzipped.

Shortcuts: simin - siminf

Return to command index

A1.187 simqtl


Purpose:  Simulate a QTL and (optionally) a linked marker

Usage:    simqtl [-seed <seed>] [-inform] [-gfile <genotype_file>]

          simqtl -model

          simqtl -freq {<f_1> ...} [-mfreq {<f_1> ...} [-theta <theta>]]
                 [-ntrt <#traits>] -mean {{<m1_AA> <m1_Aa> <m1_aa>} ...}
                 -sdev {<sd_1> ...} [-cov {<cov1> ...}]
                 [-beta {{<b1_AA> <b1_Aa> <b1_aa>} ...}]
                 [-cmean {<cov1_mean> ...}] [-mage {<mean_age>}]
                 [-rhog {<gen_corr_2,1> <gen_corr_3,1> <gen_corr_3,2> ...}]
                 [-rhoe {<env_corr_2,1> <env_corr_3,1> <env_corr_3,2> ...}]
                 [-h2r {<h2r_1> ...}]

          simqtl -nloc <#QTLs> -nall {<nall_1> ...}
                 [-ntrt <#traits>] -mean {{<m1_AA> <m1_Aa> <m1_aa>} ...}
                 -sdev {<sd_1> ...} [-cov {<cov1> ...}]
                 [-beta {{<b1_AA> <b1_Aa> <b1_aa>} ...}]
                 [-cmean {<cov1_mean> ...}] [-mage {<mean_age>}]
                 [-rhog {<gen_corr_2,1> <gen_corr_3,1> <gen_corr_3,2> ...}]
                 [-rhoe {<env_corr_2,1> <env_corr_3,1> <env_corr_3,2> ...}]
                 [-h2r {<h2r_1> ...}]

          There are two steps in the simulation process: (1) specifying
          the simulation model, and (2) running the simulation. The first
          form of the command shown is used to run the simulation, and
          takes the following optional arguments:

            -seed    An integer seed for the random number generator.

            -inform  If this argument is given, the simulated marker
                     genotypes will be fully informative.

            -gfile   For models of the second type described below, QTL
                     genotypes are read from a file rather than simulated.
                     This argument specifies the name of this file.

          The simulated trait values are written to the file "simqtl.phn".
          A simulated trait value will not be assigned to any individual
          who has an unknown age, or who is missing data for any other
          covariate specified in the simulation model. If QTL genotypes
          are simulated, they will be written to the file "simqtl.qtl".
          If the model includes a linked marker, the simulated marker
          genotypes are written to "simqtl.mrk". Two additional files are
          created and used by this command: "simqtl.dat" and "simqtl.par",
          which contain pedigree/covariate data and the model parameters,
          respectively. All of these files are created in the current
          working directory.

          If QTL genotypes are read from a file, that file must contain
          an ID field, a FAMID field (if required - see the documentation
          for marker files), and, for each QTL, a field containing the
          QTL genotype. Unlike SOLAR marker genotypes in general, the
          QTL genotypes must have integer alleles numbered consecutively
          beginning with 1. Also, if there are multiple QTLs, the position
          of the alleles is significant. For example, the QTL genotypes
          1/3 and 2/1 are combined to form the two-locus genotype 12/31,
          while genotypes 3/1 and 2/1 yield the two-locus genotype 32/11.

          The second form of the command displays the simulation model
          parameters.

          The remaining forms of the command are for the two types of
          simulation model that may be specified. In the first model, a
          single QTL and, optionally, a single linked marker are simulated.
          One or more correlated quantitative traits will be generated,
          along with a polygenic background. Adjustments may be made to
          the trait means for covariate effects. The covariates sex and
          age are always included although no adjustments need be specified
          for these covariates. The sex field is a required SOLAR field
          and so it is guaranteed to be available. The age field is taken
          from the phenotypes file, if one has been loaded. The name of
          the age field must be "AGE". It is not an error if there is no
          age field in the phenotypes file. The model will still contain
          age correction terms (which should be set to zero), but obviously
          no adjustment to the trait mean involving age can be made.
          If adjustments to the mean are made for sex and/or age, then
          betas must be specified for each of 5 effects: sex, male age,
          female age, male age^2, and female age^2, in that order. The
          parameters for this model are:

            -freq    The frequency of QTL alleles 1, 2, ,,,, N-1 where
                     the QTL has N alleles.

            -mfreq   The frequency of marker alleles 1, 2, ..., N-1 where
                     the marker has N alleles.

            -theta   The recombination fraction between the QTL and the
                     marker. The default value is 0, i.e. the QTL and marker
                     are fully linked.

            -ntrt    The number of traits controlled by the QTL. The default
                     value is 1.

            -mean    For each trait, a list of genotypic means. Genotypes
                     are ordered as follows: 1/1, 2/1, 2/2, 3/1, 3/2, ...
                     That is, the mean for genotype i/j, i >= j, is the
                     k-th element in the list, where k is given by

                          k = i*(i - 1)/2 + j

                     Because phase is not considered, genotypes i/j and j/i
                     are the same.

            -sdev    For each trait, the within-genotype phenotypic standard
                     deviation.

            -cov     A list of covariates, in addition to sex and age, for
                     which adjustments to the trait mean(s) will be made.

            -beta    For each trait, a set of lists, one for each covariate
                     including sex and age, of genotype-specific adjustments
                     to the trait mean. Genotype order is the same as for
                     genotypic means. If no betas are specified, they will
                     all default to 0, i.e. no covariate effects. As noted
                     above, sex and age together require 5 betas for each
                     genotype. If the betas for a particular trait and
                     covariate are not genotype-specific, the corresponding
                     list can be shortened to a single value; this value
                     will be used for every genotype.

            -cmean   For each covariate other than sex and age, a mean
                     value to subtract from the covariate before applying
                     a covariate correction to the trait means.

            -mage    A mean age to be subtracted before applying the age
                     correction to the trait means.

            -rhog    For each pair of traits, the genetic correlation
                     between those two traits. If there are N traits, the
                     order of the pairs is (2,1), (3,1), (3,2), ..., (N,1),
                     (N,2), ..., (N,N-1). The default is no genetic
                     correlation.

            -rhoe    For each pair of traits, the environmental correlation
                     between those two traits. The default is no
                     environmental correlation.

            -h2r     For each trait, the residual heritability expressed as
                     the fraction of trait variance after the QTL effect has
                     been accounted for. The default is no residual
                     heritability.


          In the second type of model, there may be multiple QTLs. The QTL
          genotypes are read from a file rather than simulated. Parameters
          unique to this model are:

            -nloc    The number of QTLs.

            -nall    A list of the number of alleles at each QTL.

          The remaining parameters are the same as in the first model type.
          The order of multi-locus genotypes is analogous to the single
          locus case. The multi-locus genotype i/j, i >= j, is in the k-th
          position, where k is given by

              k = i*(i - 1)/2 + j

          The i and j refer to the i-th and j-th multi-locus haplotypes.
          Haplotypes are ordered so that the alleles of the last locus
          vary the fastest, while the alleles of the first locus vary the
          slowest. For example, given three bi-allelic loci, the order of
          the three-locus haplotypes is

              111, 112, 121, 122, 211, 212, 221, 222

          The order of the three-locus genotypes is then

              111/111, 112/111, 112/112, 121/111, 121/112, 121/121, ...


Examples:

       1. Simulate a QTL with two alleles, call them A and a, where the
          frequency of allele A is 0.7. A single trait will be simulated
          for which the mean of genotype AA is 90, the mean of genotype
          Aa is 100, and the mean of genotype aa is 120. The trait will
          have a within-genotype standard deviation of 10, and a residual
          heritability of 0.3. A marker with 5 alleles of equal frequency
          will be generated which has a recombination fraction of 0.05
          with the QTL. The required commands are shown below - one to
          create the simulation model, another to actually perform the
          simulation. The first command has been broken into two lines
          to avoid line-wrapping, but must actually be entered as a
          single line.

              solar> simqtl -freq .7 -mfreq {.2 .2 .2 .2} -theta .05
                     -mean {90 100 120} -sdev 10 -h2r .3
              solar> simqtl


       2. Simulate a QTL with 3 alleles; the allele frequencies are 0.5,
          0.3, and 0.2. There is no linked marker. There are two traits
          associated with this QTL. Sex and age have an effect on the
          first trait; there is no sex-by-age interaction or second-order
          age effect. The traits are correlated both genetically and
          environmentally. A mean population age of 40 is subtracted
          prior to the age correction to the mean of the first trait.
          When the simulation is run, the random number generator is
          seeded with the integer value 12345.

              solar> simqtl -freq {.5 .3} -ntrt 2
                     -mean {{10 15 12 20 18 30} {50 55 60 60 55 80}}
                     -sdev {2.5 10} -h2r {.2 .65}
                     -beta {{{-1.2 .1 -.5 1.4 2 -.5} {2.4 3 1.6 -4 0 -1}
                             {2.4 3 1.6 -4 0 -1} {0} {0}}
                            {{0} {0} {0} {0} {0}}}
                     -mage 40 -rhog .7 -rhoe .4
              solar> simqtl -seed 12345


       3. Simulate a quantitative trait controlled by two QTLs. The
          first QTL has 3 alleles, and the second QTL has 2 alleles.
          There are 6 two-locus haplotypes, so we have a total of 21
          two-locus genotypes in the order

              11/11, 12/11, 12/12, 21/11, 21/12, 21/21, 22/11, 22/12,
              22/21, 22/22, 31/11, 31/12, 31/21, 31/22, 31/31, 32/11,
              32/12, 32/21, 32/22, 32/31, 32/32

          When the simulation is run, the two-locus genotypes are read
          from the file "2locgtyp".

              solar> simqtl -nloc 2 -nall {3 2}
                     -mean {26 31 36 28 33 30 33 38 35 40 31 36 33 38 36
                            36 41 38 43 41 46}
                     -sdev 2.8
              solar> simqtl -gfile 2locgtyp

Shortcuts: simqt - simqtl

Return to command index

A1.188 slod


Purpose:  Calculate slod (score-based LOD equivalent) on current model

Usage:    slod newparam

Example:  slod h2q1


Shortcuts: slod - slod

Return to command index

A1.189 snp


Purpose:  Process SNP data.

Usage:    load snp [-xlinked] <genofile> [<locfile>] ; loads SNP data
          snp show                        ; displays summary of SNP data
          snp covar [-nohaplos] [-impute] ; prepare genotype covariates file
          snp qtld                        ; prepare QTLD covariates file
          snp ld [-window <window>] [-overwrite]
                 [-plot [-absrho] [-file <psfile>]
                  [-title <title>] [-date] [-gray]]
                                          ; compute linkage disequilibrium
          snp effnum [<method>] [-alpha <alpha>]
                                          ; use <method> to compute the
                                            effective number of SNPs
          snp unload                      ; unload SNP data

          SNP genotype data is treated as a special case of marker data.
          The file containing SNP genotypes must be in the same format as
          any SOLAR marker data file, and the SOLAR 'marker' and 'freq'
          commands can be used to process the SNP genotype data.  However,
          the following restriction applies to SNP genotype data: there
          must be exactly two allelic types present in the data for each
          SNP.  If a SNP has only a single allele, i.e. the SNP is not
          polymorphic, it will be loaded but cannot be used for further
          analysis.  If a SNP with more than two alleles is encountered,
          the 'load snp' command will fail.  After a successful load, a
          file named 'snp.typed' is created, which contains a field, named
          nGTypes, giving the number of SNPs genotyped for each pedigree
          member.  This field is empty for untyped individuals.

          The locations of the SNPs can be read from a standard SOLAR map
          file by the 'load snp' command.  Each location must be given as
          an integer number of basepairs.  Only those SNPs that appear in
          the map file will be included in SNP processing commands.  While
          it is not necessary that every SNP in the marker file be present
          in the map, it is required that genotype data be loaded for all
          SNPs in the map file.  If the user does not specify a map file
          in the 'load snp' command but a map file has already been loaded,
          that map file will be used.  Otherwise, a dummy map file, named
          'snp.map', will be created in which consecutive basepair locations
          have been assigned to the SNPs, starting at zero.

          The 'snp show' command displays a summary of the SNP genotype
          data.  The contents of this display are also saved in the file
          'snp.show.txt'.  The information displayed includes SNP name,
          location, number of typed individuals, and allele frequencies.
          The frequency information is ordered so that the common (higher
          frequency) allele is listed first.  If available, the standard
          error of the allele frequency estimates and the p-value for a
          test of Hardy-Weinberg equilibrium are also displayed.  The
          allele frequency standard errors are computed by the 'freq mle'
          command.  The HWE test is performed when the '-hwe' option is
          included in the 'freq mle' command.

          The 'snp covar' command produces a file, 'snp.genocov', in which
          the SNP genotypes have been recoded to be suitable for use as
          covariates in a SOLAR measured genotype, QTLD, or Bayesian
          QTN analysis.  This file includes a field for each SNP, named
          snp_<name>, where <name> is the SNP name.  Genotypes are coded
          as the number of copies of the rarer allele: 0, 1, or 2.  The
          'snp covar' command can be run in one of three different ways.
          By default, SNP haplotypes generated by a haplotype estimation
          procedure, e.g.  SimWalk2, are used to impute as many missing
          genotypes as possible.  The haplotypes are read from the file
          'snp.haplotypes', which is created from the output of the
          haplotype estimation procedure using the 'snphap' command.

          If SNP haplotypes are not available, the '-nohaplos' option
          can be given to the 'snp covar' command, in which case the
          covariates are generated solely from the genotype data.

          The third method for generating covariates, invoked with the
          '-impute' option, is to extend the genotype imputation of the
          default method.  In this case, a missing genotype which cannot
          be assigned exactly from the haplotypes, is imputed from the
          weighted average of all haplotypes which are consistent with
          that individual's genotype and estimated haplotype data, where
          the weights are the estimated haplotype frequencies.  These
          frequencies are read from the file 'snp.haplofreqs', which is
          created by the 'snphap' command.  This method of generating
          the covariates ensures that each individual has complete data,
          i.e. there are no missing covariates.  Because covariates will
          have been imputed for all pedigree members, whether they were
          genotyped or not, it may be desirable to include in an analysis
          only those individuals for whom at least some minimum number
          of SNPs were typed.  This can be done by selecting on the
          nGTypes field, which is taken from the file 'snp.typed' and
          automatically joined to the covariates file.

          In addition to the file 'snp.genocov', the 'snp covar' command
          creates a file 'snp.geno-list', which contains the names of
          the covariates, one per line.  This file can be used to
          specify the covariates to be included in various association
          analyses, e.g. the 'bayesavg' command.

          The 'snp qtld' command generates another type of covariates
          file, in this case the covariates required for a SOLAR QTLD
          analysis.  This file, which is named 'snp.qtldcov', contains
          four covariates for each SNP: b_<name>, w_<name>, b2_<name>,
          and w2_<name>, where <name> is the name of the SNP.  As with
          the genotypes covariates file, the nGTypes field from the
          file 'snp.typed' is automatically included and can be used
          to exclude untyped individuals from an analysis.

          The 'snp ld' command computes the pairwise correlation among
          the SNP genotypes.  This measure can be used to identify
          those SNPs which are in linkage disequilibrium.  The signed
          pairwise correlations are written to the file 'snp.ld.dat'.
          If 'snp.ld.dat' already exists, the correlations are not
          recomputed unless the '-overwrite' option is specified.

          The '-window' option limits the number of pairwise correlations
          that will be computed by the 'snp ld' command.  Only the pairs
          of SNPs separated by no more than the number of basepairs in
          the window will be considered.  A map file containing basepair
          locations must be loaded in order to use this option.

          When the '-plot' option is added to the 'snp ld' command, a
          PostScript LD plot will be displayed.  If the file 'snp.ld.dat'
          already exists, the genotypic correlations are not recomputed.
          The plot is saved in the file specified by the '-file' option',
          or, by default, in the file 'snp.ld.ps'.  The LD measure shown
          in the plot is the square of the genotypic correlation (rho^2),
          or, if the '-absrho' option is specified, the absolute value of
          the correlation.  A plot title, enclosed in quotes, can be
          supplied with the '-title' option.  The '-date' option adds a
          date stamp to the plot.  The '-gray' option produces a gray-scale
          version of the plot.

          The 'snp effnum' command uses the specified method to estimate
          the effective number of SNPs based on the pairwise genotypic
          correlations.  This is an estimate of the number of independent
          tests in an association analysis utilizing these SNPs, which can
          be used to determine an appropriate significance level for an
          association analysis utilizing these SNP data.  Currently, the
          following methods are implemented:

            mosk   Moskvina & Schmidt    (default)
            liji   Li & Ji

          For example: snp effnum liji

          The method of Moskvina & Schmidt is the more conservative of the
          two and is the default.  The Li & Ji method entails computing
          the eigenvalues of the genotypic correlation matrix.  The number
          of SNPs cannot exceed the number of genotyped individuals (i.e.,
          the number of rows in the correlation matrix) or the matrix will
          be singular.

          The 'snp effnum' command also returns the p-value required for
          a result to be considered statistically significant after the
          correction for multiple testing is applied.  This p-value is a
          function of the effective number of SNPs and the experiment-wide
          significance threshold (target alpha).  The '-alpha' option
          specifies the target alpha (default value = .05).


Return to command index

A1.190 snphap


Purpose:  Compute SNP haplotypes and haplotype frequencies.

Usage:    snphap show               ; displays summary of SNP haplotypes
          snphap prep <program>     ; prepares input files needed to compute
                                    ; SNP haplotypes using <program>, where
                                    ; <program> is simwalk (sw) or merlin
          snphap import <program> [-file <filename>] [-overwrite]
                                    ; imports SNP haplotypes from an output
                                    ; file computed by <program>, where
                                    ; <program> is simwalk (sw) or merlin
          snphap freq prep          ; prepares input file needed to compute
                                    ; SNP haplotype frequencies using the
                                    ; program snphap
          snphap freq import [-file <filename>] [-overwrite]
                                    ; imports SNP haplotype frequencies
                                    ; from an output file generated by the
                                    ; program snphap
          snphap count              ; computes SNP haplotype frequencies by
                                    ; counting haplotypes present in data
          snphap covar              ; prepare haplotype covariates file

          The 'snphap' command assumes that the 'snp load' command has been
          used to load SNP genotype data.  The main purpose of this command
          is to prepare the SNP haplotypes and haplotype frequencies files
          used by the 'snp covar' command.  SOLAR does not do haplotyping
          or haplotype frequency estimation itself, but rather relies on
          these functions being provided by external programs.

          The 'snphap prep' command is used to generate the input files for
          a haplotyping run using either SimWalk2 or Merlin.  The output
          file created by the haplotyping procedure is then imported into
          SOLAR with the 'snphap import' command, which creates the file
          'snp.haplotypes'.  The '-overwrite' option guards against the
          unintentional replacement of an existing haplotypes file.

          Haplotype frequencies can be generated in two ways.  The program
          SNPHAP can be used to compute frequency estimates using an EM
          algorithm with the assumption that all individuals are unrelated.
          The 'snphap freq prep' command prepares the input file required
          by SNPHAP.  The 'snphap freq import' command processes the SNPHAP
          output to create the file 'snp.haplofreqs'.  Alternatively, the
          'snphap count' command generates the haplotype frequencies file
          by simply counting the haplotypes present in the haplotypes file
          'snp.haplotypes'.  The haplotype frequencies file is sorted in
          descending order of frequency, so that the most common haplotype
          appears first.

          The 'snphap covar' command generates a haplotype-based covariates
          file suitable for use in a SOLAR analysis.  This file, which is
          named 'snp.haplocov' includes a covariate field for each of the
          haplotypes present in the file 'snp.haplotypes'.  These fields are
          named hap_<hap#> where <hap#> is the position of the corresponding
          haplotype in the file 'snp.haplofreqs'.  For example, the field
          corresponding to the most frequent haplotype is named hap_1 and
          the value in this field is the number of copies of this haplotype
          that an individual possesses.  Covariates are defined only for
          those individuals with two complete haplotypes in 'snp.haplotypes'
          are included in the haplotype covariates file.


Return to command index

A1.191 solarfile


Purpose:  Read data file applying "field" name mapping

Usage:    Same as tablefile (see) but using "solarfile" command name, plus:

          solarfile $tablenum establish_name <generic-name>

          establish_name returns the actual field name applied or mapped
          the the generic name.  For example, the generic-name "id" might
          actually be "ego" in the file, or a name mapped to "id" using
          the field command.  The generic-names are listed by the field
          command.

Notes:    Intended for use in scripts.

          This command extends the "tablefile" command by allowing for
          user-supplied field name mapping using the "field" command.
          This also supports both default field names for basic identifiers:
          id,fa,mo and ego,sire,dam.

Return to command index

A1.192 solarmodel


Purpose:  Check SOLAR version compatibility of model

solarmodel appears a the top of all new model files and identifies the
model version.  If the version is incompatible with the current
version, an error message is displayed.

solartcl appears at the top of all upgraded script files.  SOLAR
programmers are encoured to use solartcl as well.

To upgrade solar models, use the "upgrade" command.

Shortcuts: solarm - solarmodel

Return to command index

A1.193 solartcl


Purpose:  Check SOLAR version compatibility of tcl file

solar::solarmodel --

Purpose:  Check SOLAR version compatibility of model

solarmodel appears a the top of all new model files and identifies the
model version.  If the version is incompatible with the current
version, an error message is displayed.

solartcl appears at the top of all upgraded script files.  SOLAR
programmers are encoured to use solartcl as well.

To upgrade solar models, use the "upgrade" command.

Shortcuts: solart - solartcl

Return to command index

A1.194 solarversion


Purpose:  Display the SOLAR program version

Usage:    solarversion


Shortcuts: solarv - solarversion

Return to command index

A1.195 split_class_file


Purpose: Splits a csv file based on a class column
         
Usage: split_class_file <csv file name>

Example:  split_class_file pheno.csv
pheno.csv is split into 0_pheno.csv, 1_pheno.csv,....,i_pheno.csv

Return to command index

A1.196 sporadic


solar::polygenic --

Purpose:  Perform polygenic, sporadic, and/or household analysis
            Calculate H2r, significance of H2r, and proportion of variance
              contributed by covariates.
            Optionally performs covariate screening (determine significance
              level of each covariate).

Usage:   polygenic [-screen] [-all] [-p | -prob <p>] [-fix <covar>]
                   [-testcovar <covar>] [-testrhoe] [-testrhog] [-testrhoc]
                   [-sporadic] [-keephouse] [-testrhop] [-rhopse] [-fphi]

         (screencov is an alias for 'polygenic -screen')
         (sporadic is an alias for 'polygenic -sporadic')

         Typically before giving this command, you will give trait,
         covariate, and house (if applicable) commands.  You will also load
         pedigree and phenotypes files if they have not already been loaded.

             solar> load pedigree ped
             solar> load phenotypes phen
             solar> trait hbp
             solar> covariate age sex age*sex smoke
             solar> polygenic -screen

         Alternatively, you may use the "automodel" command first to
         include all available phenotypes as covariates.  See note 2
         below and "help automodel".

         -screen   (or -s)  Perform covariate screening:
                   Calculate significance level for each covariate, and run
                   only the significant covariates in the final analysis.
                   An inclusive significance threshold of 0.1 is used,
                   but may be changed with the -prob option.  Covariates
                   may be locked in regardless of significance with the
                   -fix or -all options.

         (An alternative method of covariate analysis using bayesian
          model averaging is available with the command:
              bayesavg -covariates)

         -p        (or -prob)  p is the probability level for keeping
                   covariates as "significant."  The default is 0.1.
                   It is set to be generous so that covariates are not
                   removed unnecessarily.  (The probability levels for
                   H2r and C2 are fixed at 0.05, however, H2r is never
                   removed from the final model even if it judged to
                   be not significant, and C2 is only removed from the
                   model if it is zero in the final model and therefore
                   has no effect at all.)

         -fix      (or -f) "fix" (lock in) this particular covariate
                   regardless of significance level.  NOTE: a -fix or -f
                   qualifier is required for each covariate to be fixed,
                   for example:  -f age -f sex

         -all      (or -a) Keep all covariates in final anaysis regardless
                   of significance level.

         -testcovar <covar>  Test the probability of this covariate only.
                    All other covariates are fixed and unscreened.  This
                    argument is incompatible with -screen (screen all
                    covariates).  The tested covariate is not removed from
                    final model regardless of probability.  For -testcovar,
                    the default probability level for declared
                    "significance" is 0.05 (which can be changed with -p
                    option).  Also, the reported proportion of variance
                    is for the tested covariate only.

         -testrhoe  (Bivariate only)  Test significance of rhoe difference
                    from 0 by running model where rhoe is constrained to 0.
                    The p value is shown in the same line as the RhoE value.

         -testrhog  (Bivariate only)  Test significance of rhog differences
                    from zero and from 1 (if positive) or -1 (if negative).
                    Because there may be two p values, they are shown
                    in line(s) below the RhoG result and standard error.

         -testrhoc  (Bivariate Household only) Test significance of rhoc
                    differences from zero and 1 (if positive) and -1 (if
                    negative).  Because there may be two p values, they are
                    shown in line(s) below the RhoC result and std. error.

         -testrhop  (Bivariate polygenic only) Test significance of derived
                    estimate of phenotypic correlation differences
                    (difference from 0).

         -rhopse     (-testrhop must be specified also) Get standard error
                     of rhop, saved in model file rhop.mod and variable
                     SOLAR_RhoP_SE

         -sporadic  Only evaluate sporadic models, not polygenic.

         -keephouse Keep "household effect" C2 parameter in final model
                    even if it maximizes to zero in the best polygenic
                    (or sporadic) model.

         -residinor After maximizing final sporadic model (after covariate
                    testing, if that is done), residualize the final model
                    and inormalize the residual trait.  (Warning!  The
                    phenotypes file loaded at the end of analysis will be
                    the residual phenotypes file.)
           
          -fphi     Option to run polygenic using fphi function.

Notes:    (1) Output is written to directory selected by 'outdir' command,
          or, if none is selected, to a directory named by the trait.  This
          is called the "maximization output directory."  Polygenic results
          are in file named polygenic.out.  Important loglikelihoods and
          statistical computations are recorded in polygenic.out.logs.  If
          the -sporadic option is selected, the files are sporadic.out and
          sporadic.out.logs.  For univariate models, the residuals are
          computed and written to a file named polygenic.residuals (or
          sporadic.residuals), then the statistics of those residuals
          are written to a file named polygenic.residuals.stats (or
          sporadic.residuals.stats).  If the residual kurtosis is
          above 0.8, you get a special warning (see note 5 below).  You
          also get a special warning if the trait standard deviation is
          below 0.5, which is undesireable for numerical reasons.

          (2) Prior to running polygenic, you should set up the trait and
          covariates.  You may use the trait and covariate commands, or
          use the "automodel" command. "automodel" selects all variables
          otherwise unaccounted for in the phenotypes file as candidate
          covariates, and also sex and the standard interactions with
          sex and age.  (If you are unfamiliar with "automodel" it would
          be a good idea to examine the covariates afterwards with the
          covariates command...)

          (3) Beginning with Version 8.3.1, the phi2.gz matrix is
          loaded during maximization if phi2 is not already loaded
          by the user.  That was already being done for many kinds
          of maximization, just not the simplest univariate
          quantitative.  The AutoPhi2 modeltype brings back the
          original on-the-fly calculation of phi2, if no phi2.gz had
          been loaded.  That is often not what is intended, but
          sometimes people who had not saved their phi2.gz were
          relying on it.

          (4) If household effect (see "house") is in effect when the
          polygenic command is given, it will be included in the analysis.
          If the household parameter C2 is 0 in the household polygenic
          model, it will be removed from the final model regardless of
          whether "covariate screening" is performed, unless -keephouse
          is specified.  The p value for C2 will be computed (if C2 is
          nonzero), but the p value will not cause C2 to be removed from
          the final model. The p value of the C2 parameters is not
          computed for bivariate models.

          (5) If any covariates have been constrained by the user,
          certain tests are not allowed: the determination of total
          variance due to covariates, or the Leibler-Kullback R
          squared (done for discrete traits).  Also, such covariates
          are not included in the "screening" if the screening option
          is selected.

          (6) If you get the message about Residual Kurtosis being too high
          because it is above 0.8, there is danger of LOD scores  being
          estimated too high in a subsequent linkage analysis.  You should
          start over using either tdist or lodadj or inormal (see 
          documentation) to protect against this.  If you are already
          using tdist or lodadj, you may ignore this warning, but it would
          be fair to report both the Residual Kurtosis and the method
          you are using to deal with it.  We most strongly recommend
          inormal, which in conjunction with the define command creates
          an inverse normalized transformation of your trait(s).

          If there are no covariates, the Kurtosis is computed from the
          trait itself, and no "residuals" are computed.  The same warning
          threshold applies.  We define Kurtosis as 0 for a standard
          normal distribution; 3 has already been subtracted from the
          normalized 4th central moment.

          (7) The polygenic command only supports our "standard"
          parameterizations.  If you would like to use the esd,gsd,qsd
          parameterization, use the polygsd command (see "help polygsd"
          for more information) instead.

          (8) For bivariate polygenic models only, a derived estimate of
          RhoP, the phenotypic correlation, is displayed on terminal
          and written to polygenic.out.  This estimate is computed from the
          h2r's, rhog, and rhoe according to the following formula:

              sqrt(h2r(ti))*sqrt(h2r(tj))*rhog + 
                  sqrt(1-h2r(ti))*sqrt(1-h2r(tj))*rhoe

          To determine the significance of RhoP by comparing models with
          a rhop parameter and a rhop parameter constrained to zero, use
          the -testrhop option.  Additional models rhop.mod and rhop0.mod
          are written to the output directory.

          (9) The polygenic command creates global variables which may
          be accessed later (which is often useful in scripts).  The
          variables are:

              SOLAR_Individuals  number of individuals included in sample
              SOLAR_H2r_P        p value for h2r
              SOLAR_Kurtosis     residual trait kurtosis
              SOLAR_Covlist_P    list of p values for covariates
              SOLAR_Covlist_Chi  list of chi values for covariates
              SOLAR_RhoP         derived estimate of phenotypic correlation
                                   for bivariate polygenic models, {} if
                                   not calculated
              SOLAR_RhoP_P       -testrhop sets this to p value of rhop
                                   being nonzero
              SOLAR_RhoP_SE      -rhopse sets this to se value of rhop
              SOLAR_RhoP_OK      -testrhop sets this if likelihood of rhop
                                   parameterized model matches polygenic,
                                   as it should

          The covariate lists are created only if the -screen option
          is used.  All screened variables are included, regardless of
          whether they were retained in the final model.  Before you
          can access any of these variables in a script, you must
          use a "global" command.  For example:

              global SOLAR_Kurtosis
              if {$SOLAR_Kurtosis > 4} {puts "Very bad kurtosis!"}

          (10) The default is for the standard error option to be turned
          on (and temporarily off, when desireable for certain tests).
          However, if you turn the standard error option off before
          starting polygenic, it will remain off.
          

Return to command index

A1.197 sporadic_normalize


Purpose: Residualizes trait with sporadic model then inormalizes the residuals.

Usage: sporadic_normalize -out <output file name> [optional: 
  -list <file containing a list of traits> -class <perform by class> ]

	sporadic_normalize is used for factoring out covariates and inormalizing traits prior 
  to performing other calculations.  It can be used in two ways.  One way is to 
  select a trait with the trait command then select covariates with covariate command 
  prior to calling this command.  The second way requires the use of the -header option,
  which is a file containing the list traits separated white space.  When the -header
  option is used sporadic_normalize residualizes and inormalizes all the traits listed
  within the header file.  Covariates are still selected with covariate command before
  calling this command.  If the phenotype contains a class column and you wish to 
  perform the calculations by class then use the -class option by listing the classes
  separated by commas, for example -class 0,1,2,3 . 	

Return to command index

A1.198 sporclass


solar::polyclass -- (EXPERIMENTAL)

Purpose:  Set up polygenic model with class specific parameterization

Usage:    polyclass [-g] [-intrait] [-incovar] [<class-start>[-<class-end>]]+
                    [-comb] [-maxi] [-rincovar] [-maxsnp <snp_name>]
          sporclass [-g] [-intrait] [-incovar] [<class-start>[-<class-end>]]+
                    [-comb] [-maxi] [-rincovar] [-maxsnp <snp_name>]

          -g   Use global phenotypic values to set parameter adjustments
               (otherwise, means are determined for each class)

          -intrait  inormalize trait values on a per-class basis
          -resmax inormalize residual values in place of traits
          -incovar  (NOT WORKING IN version 7.1.2) inormalize covar values
                    on a per-class basis (only used for simple linear
                    covariates, no interactions or exponents)
          -comb     all classes combined model
          -max      after building the model, maximize it

          -maxsnp <snp_name>  Maximize and include  snp_name as covariate
                       in the model and determine statistics for it: beta,
                       beta se, chi, p, and variance explained (varexp).  
                       H2r's are reported for the models with and
                       without the snp.
          -append   Append results to existing output file(s) if any

Short Example:

           trait q4
           covariate age sex
           polyclass 1-3 9
           maximize -q

Notes: One phenotypes file must have a field named "class" which defines
       the class value for each person in the sample.

       Class specific parameters are given names with _c<class> appended.

       User covariates are transformed into class-specific mu addends.
       All individuals in sample must have all variables specified as
       covariates.

       After choosing trait and covariates, do either sporclass or
       polyclass.  You cannot do a second polyclass on a sporclassed model
       to make it polygenic.

       Unbalanced covariates for multivariate traits are not supported.
       This is different from ordinary covariate behavior for multivariate
       traits--which permits covariates to be missing in the sample if they
       are specific to a missing trait.

       A defined pseudo-covariate named "blank_classes()" restricts the
       sample to the union of all classes specified.

       The maximized model is asved in the output directory as
       polyclassmax.mod with output file polyclassmax.out.  Note that if
       -intrait option is selected, trait name and default output
       directory will have leading i_ prefix (for the inormalization).
       If the -resmax option is selected, the trait will be named
       "residual" or "i_residual" if -intrait is also selected.


Return to command index

A1.199 spormod


Purpose:  Set up a sporadic model with the standard parameters

Usage:    spormod

Notes:    There are no arguments.  You must have previously loaded the
          phenotypes file, selected the trait, and specified the
          covariates.

          Household effects are suspended.  If you want a 'household'
          model, give the spormod command first, then the 'house' command.

          The starting lower bound for e2 is controlled by e2lower.

          Normally you do not use this command directly, but instead use 
          the "polygenic" command to do a complete polygenic analysis,
          which maximizes a sporadic model which was set up using this
          command.  See the tutorial in Chapter 3.

Shortcuts: spor - spormodel

Return to command index

A1.200 startclock


solar::stopclock

Purpose: simple local timer (elapsed time)

Usage: startclock;<timed command>;stopclock

See Also: timediff

Return to command index

A1.201 stats


Purpose:  Get and/or show statistics for any variable in a file

Usage:    stats [<variable>+ | -all [-file <filename>]] [-q] [-return]
                [-out <outfile>] -sample

          -all     show stats for all variables in phenotypes file
          -return  do not write output file, return list of stats;
                     use stats_get to parse return list
          -q       do not display to terminal
          -out     specify alternate output filename; also returns list
                   of stats
          -sample  Use only the sample of the current model (there must be
                   a current model, or at least a defined trait, and 
                   statistics can only be computed for the trait(s) and
                   covariate(s) in that model)

          The default variable is the current trait, and the default
          filename is the currently loaded phenotypes file.  You may also
          specify one or more variables.

          Results are written to stats.out in the CURRENT WORKING DIRECTORY.
          (Not the usual output directory, since the trait need not be set.)
          They are also displayed on the terminal.

          The statistics computed are mean, minimum, maximum, standard
          deviation, skewness, and kurtosis.  (Note: We define kurtosis
          as 0 for a standard normal distribution; 3 has already been
          subtracted from the normalized 4th central moment.)

          See also the zscore command, which uses these stats to
          zscore the current trait during maximization.  The zscore
          procedure uses stats with the -out option.

          If there are multiple phenotypes files, joinfiles will be
          used to create a joined file in the working directory named
          joinfiles.stats.[pid].[uname -n].out.  Non-unique fieldnames
          will be renamed following the rules of joinfiles.  Under most
          circumstances, this joined file will be deleted before
          stats returns.  To run through the entire contents (and names)
          in the joined file, use the "stats -all" command.

          Variables created by a "define" command will work with stats
          beginning with SOLAR version 4.2.2.  Note that such variables
          are not evaluated by the command "stats -all".

Return to command index

A1.202 stats_get


Purpose:  Retrieve statistics from list returned by stats

Usage:    stats_get <stats> <statistic> [<variable>]

          <stats>      list returned by stats procedure
          <statistic>  name of statistic desired (see below for list)
          <variable>   select this variable (default: <first>)

Example:  set stat [stats -q -return q1]
          set kurt [stats_get $stat kurtosis]
          set skew [stats_get $stat skewness]

Notes:    The following statistics are available:

          variable     name of variable
          count        number of individuals having this variable (sample)
          missing      number of individuals missing this variable
          mean         mean
          min          minimum value
          max          maximum value
          sd           standard deviation
          skewness     skewness
          kurtosis     kurtosis
          discrete     0 if quantitative, 1 if discrete, -1 if not coded
                         properly
          alpha        0 if valid numbers; 1 if alphanumeric

          Of course, if a variable is selected, that variable must have
          been included in the stats list.  When running the stats command
          you may select any number of variables or use the -all option.
          See the stats command for further information.

Return to command index

A1.203 stepfor


Purpose: Foward stepwise covariate screening

Usage:   stepfor [-list listfile] [-list list] [-verbose] [-v]
                 [-fix listfile]  [-fix fixlist] [-max maxdim]
                 [-p pvalue] [-test othertest] [-par] [-parclean]

         stepclean     ;# Remove fully_typed covariate and unload file

         By default, stepfor will test all covariates in the current
         model, testing them all and then fixing the best one, and then
         repeating the process until the best one does not meet the
         default pvalue of 0.05, or user specified p-value or test (see
         below).  The final model will contain all the covariates which met
         the screening test.  A file named stepfor.out is written to the
         output directory with all the loglikelihoods, and a file named
         stepfor.history is written with other information.  All of the
         best models for each number of covariates are saved as
         stepfor.null<i> where <i> is the number of tested covariates.

         To ensure that all models use the same sample, a new file named
         fully_typed.out is created in the output directory which
         defines a variable named "fully_typed" for each fully typed
         individual.  This file is added to the list of open phenotypes
         files, and the variable "fully_typed" is added to the
         model as a "null" covariate which has no effect on the model
         other than restricting the sample to fully typed individuals.

         To remove the fully_typed covariate and unload the fully_typed.out
         phenotypes file, give the command "stepclean" after stepfor has
         completed.

         -list listfile    listfile is a file containing a list of all
                           covariates to be tested, one on each line.
                           The filename cannot contain spaces.  These
                           covariates may or may not be in the model when
                           the command is given.  If the -list option is
                           specified, all other covariates in the starting
                           model are automatically fixed.
         -list list        Alternatively, a Tcl list of covariates to
                           be tested can be specified.  Tcl lists are
                           space delimited and enclosed in quotes or curly
                           braces.

         -fix list         list is a Tcl list of covariates to be
                           included in every model and not tested.  Their
                           values will be estimated by maximum likelihood
                           for every model, unless you constrain them.
                           These covariates may or may not in the model
                           when the command is given.  For -fix, a list
                           could be simply one phenotype, and that
                           supercedes a file with the same name.
         -fix listfile     Alternatively, a file containing a list of all
                           covariates to be included in every model may
                           be specified.  The filename cannot contain
                           spaces.  The list of covariates to be fixed
                           will supercede the list of covariates to be
                           tested if the same covariate occurs on both
                           lists, however a warning will be given.

          -p pvalue        pvalue is the highest p value allowed for
                           a covariate to be included.  The default is 0.05.

          -max maxdim      maxdim is the maximum number of test covariates
                           to be included in a model (the maximum dimension).

          -verbose         Show maximization output during maximizations.
          -v               Same as -verbose

         -par              New and EXPERIMENTAL!  This option turns on Parallel
                           processing on the SFBR GCC Compute Ranch.
                           WARNING!  Do not run more than one instance of
                           stepfor -par from the same working directory.
                           Parallel stepfor will use many (but not all!) ranch
                           machines, and access for other users and jobs may
                           be delayed due to gridware thrashing.  The usual
                           output is not printed to the terminal to save time
                           but numerous parallel status messages are printed
                           to help the developers make this program better.
                           The parallel operation is automatic and the
                           parallel status messages may be ignored by most
                           users most of the time unless there is no output
                           for more than fifteen minutes.  Note: If model
                           includes linkage element matrices loaded from
                           some mibddir, those matrices should be relocated
                           to the working directory, or specified with an
                           absolute pathname in the model file.  This is
                           because in parallel operation the model is loaded
                           not in the current working directory but in a
                           subdirectory of /tmp.

         -parclean         Normally, parallel stepfor cleans up after itself.
                           However, if it is necessary to force a shutdown
                           of a parallel stepfor, normal cleanup is not
                           done.  "stepfor -parclean" cleans up all the
                           junk stepfor files in /tmp directories on all
                           ranch machines.  This must be run on medusa.  Do
                           not run if you have any other running parallel
                           jobs (parallel stepfor, parallel bayesavg, or any
                           parallel job using "launch" or "doscript") as
                           their files may be deleted too.
                           See also "doranch" for other ranch cleanup
                           procedures.  Cleanup history is written to a file
                           named cleantmp.out.

          -test othertest  othertest is a user defined Tcl proc that judges
                           whether or not a covariate should be included.
                           The test model with the best covariate is loaded
                           at the time this procedure is called.  This
                           procedure takes two mandatory arguments (whether
                           they are needed by the procedure or not).

                           loglike0 nullmodelname

                           loglike0 is the loglikelihood of the null model
                           which does not contain the current test covariate.
                           nullmodelname is the pathname to the null model
                           itself.  The procedure may obtain the
                           loglikelihood of the current model with the
                           loglike command.  The default procedure looks
                           like this:

                       proc stepfortest {loglike0 nullmodel} {
                           set chisq [expr 2.0 * ([loglike] - $loglike0)]
                           if {$chisq >= 0} {
                             	 set pvalue [chi -number $chisq 1]
                           } else {
                         set pvalue 1
                           }
                           set pvalue [chi -number $chisq 1]
                           putsout stepfor.history "\n    *** p = $pvalue"
                           global SOLAR_stepfor_pvalue
                           if {$pvalue <= $SOLAR_stepfor_pvalue} {
                               return 1
                           }
                           return 0
                       }

                           Note that the default procedure does not use
                           the nullmodel argument, but it does use a
                           global variable that you will not have to use.
                           The global supports the -p argument.  The
                           procedure may load the nullmodel without
                           restoring the current model; that is handled
                           by the stepfor procedure itself.
         

Return to command index

A1.204 stepup


Purpose: Covariate screening by Step Up algorithm, useful for QTN analysis

Usage:   stepup [-list listfile] [-list list] [-verbose]
                [-fix listfile]  [-fix fixlist]
                [-cutoff cutoff] [-logn logn] [-finishlogn]
                [-symmetric] [-cornerdf df] [-par]
                [-parclean]

         stepup is an fast version of bayesavg and may be used in
         QTN analysis.

         By default, stepup will test all covariates in the current
         model one at a time, then add all the new covariate models
         within the BIC cutoff to the window.  Then the window models are
         subjected to another round of testing against all covariates,
         and the process repeats until no more models are added to the
         window.  Unlike bayesavg, this algorithm doesn't test all
         possible models, just those that are derived from those in
         the window.  When completed, it writes files named stepup.win
         and stepup.avg to the output directgory containing posterior
         probabilities for the window models and components.

         To ensure that all models use the same sample, a new file named
         fully_typed.out is created in the output directory which
         defines a variable named "fully_typed" for each fully typed
         individual.  This file is added to the list of open phenotypes
         files, and the variable "fully_typed" is added to the
         model as a "null" covariate which has no effect on the model
         other than restricting the sample to fully typed individuals.
         This covariate is removed from the final best model stepup.best,
         so you may get a different likelihood in subsequent maximization.

         Up to dimension 3, all models with BIC better than the null model
         are retained.  (This feature may be controlled with the -cornerdf
         option.)  Also, the default "strict" rule is only applied to
         remove apparently redundant higher dimensional models at the
         very end after all important dimensions have been scanned.

         -list listfile    listfile is a file containing a list of all
                           covariates to be tested, one on each line.
                           The filename cannot contain spaces.  These
                           covariates may or may not be in the model when
                           the command is given.  If the -list option is
                           specified, all other covariates in the starting
                           model are automatically fixed.
         -list list        Alternatively, a Tcl list of covariates to
                           be tested can be specified.  Tcl lists are
                           space delimited and enclosed in quotes or curly
                           braces.

         -fix list         list is a Tcl list of covariates to be
                           included in every model and not tested.  Their
                           values will be estimated by maximum likelihood
                           for every model, unless you constrain them.
                           These covariates may or may not in the model
                           when the command is given.  For -fix, a list
                           could be simply one phenotype, and that
                           supercedes a file with the same name.
         -fix listfile     Alternatively, a file containing a list of all
                           covariates to be included in every model may
                           be specified.  The filename cannot contain
                           spaces.  The list of covariates to be fixed
                           will supercede the list of covariates to be
                           tested if the same covariate occurs on both
                           lists, however a warning will be given.

         -cutoff cutoff    Set the final BIC cutoff.  The default is 6.

         -logn logn        Use this fixed value for log(N) from the
                           beginning.

         -finishlogn logn  Recompute results of previous analysis with this
                           log(N) value.  Sometimes stepup fails at the end
                           because the standard error of the SE parameter
                           of the best BIC model cannot be computed, and
                           that is needed to compute the final log(N).
                           This option allows you to finish such a run that
                           nearly completed previously.  Be sure that
                           starting conditions (such as loaded pedigree,
                           phenotypes, model, outdir) and options are
                           exactly the same as before.  The original startup
                           (stepup.orig.mod) and null models from the output
                           directory will be loaded.  Note that the temporary
                           log(N) used by stepup by default is simply the
                           log of the sample size, and this is reported
                           to the stepup.history file.  You may choose to
                           use that or some other estimate.  A special file
                           required is stepup.winmods.prelim, which was
                           produced by the previous incompleted run of
                           stepup.

         -verbose          Show maximization output during maximizations.
         -v                Same as -verbose

         -cornerdf df      EXPERIMENTAL.  This sets the last degree of
                           freedom that uses a loose test to include models
                           in the window.  Models need only have a better
                           BIC than the null model up to and including
                           this df.  The default is 3.

         -symmetric        Apply symmetric rule rather than strict.  This
                           results in a larger window.

         -par              This option turns on Parallel
                           processing on the SFBR GCC Compute Ranch.
                           WARNING!  Do not run more than one instance of
                           stepup -par from the same working directory.
                           Parallel stepup will use many (but not all!) ranch
                           machines, and access for other users and jobs may
                           be delayed due to gridware thrashing.  The usual
                           output is not printed to the terminal to save time
                           but numerous parallel status messages are printed
                           to help the developers make this program better.
                           The parallel operation is automatic and the
                           parallel status messages may be ignored by most
                           users most of the time unless there is no output
                           for more than fifteen minutes.  Note: If model
                           includes linkage element matrices loaded from
                           some mibddir, those matrices should be relocated
                           to the working directory, or specified with an
                           absolute pathname in the model file.  This is
                           because in parallel operation the model is loaded
                           not in the current working directory but in a
                           subdirectory of /tmp.

         -parclean         Normally, parallel stepup cleans up after itself.
                           However, if it is necessary to force a shutdown
                           of a parallel stepup, normal cleanup is not
                           done.  "stepup -parclean" cleans up all the
                           junk stepup files in /tmp directories on all
                           ranch machines.  This must be run on medusa.  Do
                           not run if you have any other running parallel
                           jobs (parallel stepup, parallel bayesavg, or any
                           parallel job using "launch" or "doscript") as
                           their files may be deleted too.
                           See also "doranch" for other ranch cleanup
                           procedures.  Cleanup history is written to a file
                           named cleantmp.out.
                           

Return to command index

A1.205 stopclock


Purpose: simple local timer (elapsed time)

Usage: startclock;<timed command>;stopclock

See Also: timediff

Return to command index

A1.206 string_imatch


Purpose:  Case insensitive string match testing

Usage:    string_imatch <string1> <string2>

          Returns 1 for case insensitive match, 0 otherwise.

Note:     Useful in SOLAR scripts.

Return to command index

A1.207 stringplot


Purpose:  String plot of entire genome scan

Usage:    multipoint
          stringplot [-pass pass] [-allpass] [-title] [-lod <lod>] [-lodmark]
                     [-color <name>] [-noconv] [-date] [-nomark]
                     [-font <X-font-spec>] [-titlefont <X-font-spec>]
                     [-dash <dash spec>] [-linestyle <dash spec>]
                     [-mibddir <mibddir>]

Notes:    You can also use the command "plot -string" which has the
          same options and works identically.  For further information on
          the options, see "help plot", where all the options are
          described.  Here are the more important ones.  No options are
          usually needed, they are usually for fine-tuning the display.

          -pass       Multipoint oligogenic pass number, "1" is default
          -allpass    Plot all multipoint passes (in separate plots)
          -title      Title of plot
          -lod <lod>  Show LOD scale for lods this high (default is highest)
          -lodmark    Put marker ticks ON TOP of LOD curve (default is axis)
          -color      Takes standard names like "blue" and "red"
          -noconv     Do not mark convergence errors
          -date       Datestamp plot
          -nomark     Do not show marker ticks (very useful for GWAS)
          -font       X font for text (see xlsfonts | more)
          -titlefont  X font for title only
          -dash       Line style (see "help plot" for description of spec)
          -linestyle  Line style (same as -dash)
          -mibddir    specify mibddir (default is set with mibddir command)
          -mapfile    User mapfile
          -layers     Method of using multiple colors.  See help plot.

          mibddir and trait (or outdir) must have been specified previously.

          String plot graph will be both displayed on screen and written
          to file.  If you are running on a remote system, you will
          need to enable X window forwarding by setting DISPLAY variable
          to point back to X display, and enabling acceptance of X
          protocol with xhost + command, as described in section
          3.8.3.1 of the SOLAR documentation.  Sorry, there is no possible
          way to write the the file without displaying the plot, the
          underlying "tk/wish" program does not allow that.

          An encapsulated postscript file is written to the trait/outdir
          with the name str.passN.ps where N is the pass number,
          such as str.pass01.ps

          If a copy of the string plot script, which is named
          "stringplotk", is found in the current working directory, that
          will be used in place of the standard version in the
          SOLAR bin directory.  You can customize stringplotk as you
          wish.  (It is a "wish" script, after all.)  Good luck!

Shortcuts: stringp - stringplot

Return to command index

A1.208 stringsub


Purpose:  Simple verbatim string substitution (not regsub)

Usage:    stringsub <original> <target> <replacement>

Return to command index

A1.209 tablefile


Purpose:  Read data file in comma delimited or PEDSYS format

Usage:    set tablenum [tablefile open <filename>]
          tablefile $tablenum names               ; return field names
          tablefile $tablenum short_names         ; return short names
          tablefile $tablenum widths              ; return field widths
          tablefile $tablenum start_setup         ; start user record
          tablefile $tablenum setup <name>        ; add field to user record
          tablefile $tablenum get                 ; get user record
          tablefile $tablenum rewind              ; rewind file
          tablefile $tablenum close               ; close file
          tablefile $tablenum test_name <name>    ; test for named field
          tablefile $tablenum get_position        ; get current position
          tablefile $tablenum set_position <pos>  ; set position

Notes:    Intended for use in scripts.

          The get command will return data elements in a proper list.
          This means that if a data element includes spaces, it will be
          enclosed in braces.  For best results, data records should be
          read using lindex, which removes the braces.

          On End of File, get will return an empty list.  This should be
          tested for.  Other file errors will raise Tcl exceptions.

See Also: solarfile


Shortcuts: tabl - tablefile

Return to command index

A1.210 tclgr


Purpose:  Create xmgr session with pipe connection to SOLAR

  Note:   This is a low-level plot interface used by other commands.
          Most users will use the higher level interfaces such as
          'plot' or 'multipoint -plot.'

  Usage:  tclgr open                          ;# Start xmgr session
          tclgr send <xmgr command line>      ;# send command and wait now
          tclgr buffer <xmgr command line>    ;# add xmgr command to buffer
          tclgr flush                         ;# flush buffer of commands
          tclgr close                         ;# end xmgr  session
          tclgr syscommand <syscommand>       ;# Setup sys command for XMGR
                                              ;#   'xmgr' is the default
          <XMGR command line>               ;# defined in Ace/gr docs 

 The tclgr open command has a '-buffersize <number>' option.  The default
 buffersize is 1000.

 If the user closes the XMGR session remotely, the 'tclgr close' command
 must be used to officially close it before it can be re-opened.


Shortcuts: tclg - tclgr

Return to command index

A1.211 tdist


Purpose:  Set up t option for robust estimation of mean and variance

Usage:    tdist        set up t option
          tdist -off   turn off t option

Notes:    tdist creates a parameter t_param and sets tdist option
          tdist -off deletes t_param and unsets tdist option



Shortcuts: tdis - tdist

Return to command index

A1.212 timediff


Purpose:  Calculate seconds between two system time strings

Usage: timediff <start-time> <end-time>

See Also: startclock, stopclock

set starttime [exec date]
 ... procedure to be timed
set endtime [exec date]
return "seconds: [timediff $starttime $endtime]"

Return to command index

A1.213 toscript


Purpose:  Write previous commands to a script

Usage:    toscript [-ov] <name> [<first>[-<last>]]*

          -ov      Force overwrite of previous script
          <first>  First command number to be included
          <last>   Last command number in sequence to be included

Example:  toscript analysis 1 3 9-20  ;# include commands 1, 3, and 9-20

Notes:    Command numbers are displayed with the tcl "history" command.

          If no numbers are specified, all previous commands in this SOLAR
          session will be included in script.

          Script will be saved in file named <name>.tcl in the current
          directory.  After saving, newtcl will automatically be invoked
          so that the script can be used immediately.

          The script <name> defaults to being the first argument, but may
          also be the last argument if the <name> is not a number or range
          of numbers so there is no ambiguity.  For example:

          toscript 1-10 startscript     ;# OK
          toscript startscript 1-10     ;# OK
          toscript 2 1-10               ;# OK script named 2.tcl
          toscript 1-10 2               ;# OK script named 1-10.tcl

Return to command index

A1.214 trait


Purpose:  Select the trait (dependent variable)

Usage:    trait                               ; show current trait info
          trait <trait1>                      ; selects one trait
          trait [<traiti> ]+                  ; multivariate (up to 20)
          trait -noparm [<traiti> ]+          ; don't touch parameters at all

          [define <defname> = <expression>]+  ; Define any expressions as
          trait [<phenotype>|<defname> ]+     ; traits...see "help define"

Notes:    Solar is case insensitive to variable names.  Thus the command:

               trait foo

          will match a variable named FOO.  Variables can not be
          distinguished on the basis of case in Solar. A phenotypes file
          must be loaded before giving the trait command.

          Starting with SOLAR version 4.x, arbitary expressions including
          one or more phenotypes may be defined with the "define" command
          and then used as trait(s).  See "help define" for more details.

          If a model has already been created, it is recommended to give
          the "model new" command to clear it out prior to giving the trait
          command.  It is only reasonable to skip the "model new" if the
          new trait has similar parameter estimates to the previous trait.
          For a change of one trait to another, SOLAR will attempt
          to accomodate the change by adjusting parameter values (unless
          the -noparm option is used) as described below.  Any change of
          trait(s) involving two traits is not permitted (if any
          trait-specific parameters have been created); you will get an
          error message and the trait will go into a special error state
          ("Must_give_command_model_new") which will require you to give
          the "model new" command to clear before any model can be
          maximized (however, the model can be examined and saved in this
          state...you may wish to repair it offline in a text editor).
          Under no circumstances will the trait command create new
          parameters or delete old parameters.  Normally the "polygenic"
          command is given to create and test the standard variance
          component parameters.

          If changing from one trait to another, the Mean and SD parameters,
          if present, will be reset to zero to force setting starting values
          and boundaries during the next maximization.  Covariate betas
          will boundaries will also be zeroed.

Examples:

          trait bmi
          trait q1 q2

          define a = 10 * log(q4)
          trait a q3


Shortcuts: trai - traits

Return to command index

A1.215 transpose


Purpose:: transpose on MathMatrix or comma delimited file

Usage: transpose <MathMatrix>  ;# returns id of transposed MathMatrix
       transpose <infile> <outfile>  ;# transposes CSV file

Note: All records must have same length.  First record is treated like all
others.  To invert Pedsys file, use ped2csv first.  Memory usage for
extremely large files (>100mb) could be a problem.  If memory is exhausted
while caching the file in memory, solar might crash to the shell prompt.

Return to command index

A1.216 twopoint


Purpose:  Perform "Twopoint" analysis on directory of ibd files

Usage:    twopoint [-append] [-overwrite] [-grid] [-cparm {[<parameter>]*}]
                   -saveall

          -overwrite  (or -ov) Overwrite existing twopoint.out file.

          -append     (or -a)  Append to existing twopoint.out file.

          -cparm {}     Custom parameters.  Scanning will consist of
                        replacing one matrix with another matrix, everything
                        else is unchanged.  The starting model MUST be
                        a "prototype" linkage model with the desired
                        parameters, omega, and constraints.  Starting
                        points and boundaries for the parameters must be
                        explicitly specified.  Following the -cparm tag,
                        there must be a list of parameters in curly braces
                        that you want printed out for each model.  The
                        list can be empty as is indicated with a pair of
                        curly braces {}.  There must be a model named null0
                        in the maximization output directory for LOD
                        computation purposes.  The matrix to be replaced
                        must have name ibd or ibd1, ibd2, etc.  The highest
                        such ibd will be replaced.  If the matrix is loaded
                        with two "columns," such as "d7," each succeeding
                        matrix will be loaded with two columns also.
                        See section 9.4 for an example involving dominance.
                       
          -grid  Enables the "grid" option, which estimates recombination
                 fractions in the range theta=0 to 0.45, finding the
                 optimal value to the nearest 0.01.  (Note: this option is
                 not important for most twopoint users.  It also
                 increases evaluation time considerably.  Consider using
                 the separate "grid" command with only the markers of
                 greatest interest.)

          -saveall  Save all twopoint models in the maximization output
                    directory.  The models are named "ibd.<marker>".
Notes:
         The trait or outdir must be specified before running twopoint.

         There must be a null0.mod model in the trait or outdir
         directory.  This can be created with the polygenic command
         prior to running multipoint.  (This model may include
         household and covariate effects.  See the help for the
         polygenic command for more information.)

         An output file named twopoint.out will be created in the trait
         or outdir directory.  If that file already exists, the user must
         choose the -append or -overwrite option.

         The best twopoint model is saved as two.mod in the trait or outdir
         directory.  It is also loaded in memory at the completion of the
         twopoint command.

         IBDDIR should be set with the ibddir command prior to running
         twopoint.

         If models have two traits, the 2df LOD scores will be
         converted to 1df effective LOD scores, with the assumption
         that parameter RhoQ1 is not intentionally constrained.
         To override this, use the lodp command (see).  This feature
         was first included with beta version 2.0.1.
             

Shortcuts: twop - twopoint

Return to command index

A1.217 upgrade


Purpose:  Upgrade model files and scripts

Usage:    upgrade modelname
          upgrade scriptname.tcl

Notes:   If successful, the new file will replace the original one.
            The old file is saved with ".old" tacked on to the end of
            the name (e.g. amodel.mod.old).

         If an error is reported, the original file remains unchanged.

         If the file is a model, the ".mod" extension is assumed even if
            not specified.  Solar always tacks on ".mod" to the end of
            model filenames.

         If the file is a script, it must end with the ".tcl" extension,
            and the extension must be specified in the command as shown.
            Upgrade looks for this, and if found it assumes that a script
            is being upgraded.

         solartcl appears at the top of all upgraded script files.  SOLAR
            programmers are encoured to use solartcl as well.



Shortcuts: upg - upgrade

Return to command index

A1.218 usage


Purpose:  Print short "usage" message about a command

Usage:    usage <command>

Example:  usage multipoint    ;# shows usage of multipoint command

Notes:  Since this is printed directly to terminal, it will stay visible for
        next command.

        If help message contains no "Usage" section, the first 15 lines will
        be printed.


Shortcuts: usag - usages

Return to command index

A1.219 usort


Purpose:  Define unix sort program name 
          (used for multipoint*.out files)

Usage: usort <program>         ; use program
       usort ""                ; disables sort feature}
       usort                   ; show current program

Notes: The default is /usr/bin/sort, which should work on most system.
       It is necessary to include a path for users which have PEDSYS,
       which has its own program named "sort."  The program must be
       compatible with unix sort and have -n -o and -k arguments.

Example:  usort /usr/local/bin/sort


Shortcuts: usor - usort

Return to command index

A1.220 vcfinfo


Purpose: extract per-genotype meta information from vcf file

Usage: vcfinfo <filename> [-chrom <chrom>] [-info] [-all]

       -chrom  restrict output to chromosome <chrom>
       -info   include INFO and FORMAT fields
       -all    include everything, including sample genotypes
               (WARNING!  OUTPUT CAN BE VERY LARGE!)

See also vcfselect to extract genotype/sample data only

Notes: (1) output file is named as input file with .gz and .vcf extensions
       removed and .vcfinfo.csv or .<chrom>.vcfinfo.csv appended

       (2) vcf files permit commas within fields, which is not permitted
           in csv files.  vcfinfo converts these commas to spaces.

Return to command index

A1.221 vcfselect


Purpose: extract genotype data from vcf file (HIGHLY EXPERIMENTAL!)

Usage: vcfselect <filename> <chrom> <pos>

See also vcfinfo to extract genotype meta information.

Notes: (1) output file is named as input file with .gz and .vcf extensions
       removed and .<chrom>.<pos>.csv added.

       (2) File is returned with genotypes as originally coded, and also
           with 0,1,2 SOLAR coding if there are no blanks.  Fields are
           named sample_id, <chrom>.<pos> and snp_<chrom>.<pos>.

       (3) Errors in decoding to 0,1,2 are flagged with "error" in the
           output lines, and an error message.  However the output file
           is still written completely with the original genotype coding.

Return to command index

A1.222 verbosity


Purpose:  Set the output verbosity.

Usage:    verbosity default          ;  Set default verbosity
          verbosity plus             ;  Set above average verbosity
          verbosity min              ;  Set minimum verbosity
          verbosity max              ;  Set maximum verbosity (debugging)
          verbosity                  ;  Display current verbosity

          verbosity -number          ;  Show verbosity hex bit-coded number
          verbosity <number>         ;  Set verbosity with bit-coded number

 Notes:   During analysis scripts such as polygenic and multipoint, the
          default verbosity supresses all the usual maximization output
          (such as you would see with the 'maximize' command run by
          itself).

          The maximization output can be turned on for analysis scripts
          using the 'plus' verbosity level.  'plus' is above default,
          but below 'max.'

          The bit coded numbers used for various verbosity levels are
          subject to change.  User scripts should use the name (such
          as 'default') to be consistent with future releases.

          There are now a few reports which are so verbose that they
          are not even included in verbosity max.  They may be specified
          by using hex coded numbers (starting with "0x").  These are
          subject to change in future releases.

              0x4ffff    Max verbosity with per-pedigree likelihoods for
                         each interation.

          Hex-coded verbosity numbers were not supported prior to version
          2.0.2.


Shortcuts: verb - verbosity

Return to command index

A1.223 voxel


Purpose:  To set and save current voxel position

Usage:    voxel [<voxel-value>]

          <voxel-value> is 3 coordinates delimited by colons as x:y:z
             for example, 12:8:23

          If no voxel-value is specified, the current voxel is returned.
          If no current voxel has been defined, an error is raised.
          If a voxel has been defined, it is written to model files.
          The current voxel can also be set with the mask command, and
            that is the general way it should be done.

Notes:

1) Image traits are qualified as to type in the header of the phenotypes
   file, which must be comma-delimited type, and the type qualification
   follows the trait name separated by colon. The only type currently
   supported is NIFTI/RicVolumeSet which is called "nifti".  Such a
   header could look like this:

      ID,age,count:nifti

   Then, each corresponding data field consists of a NIFTI filename
   followed by a colon and the volume number for that person.  For example:

      A001,19,images.gz:1


Return to command index

A1.224 whoranch


Purpose: Show how many ranch machines each ranch user is using

Usage:   whoranch

Notes:   See also howmanyranch.  Requires use of Grid Engine software.

Return to command index

A1.225 write_subject_ids


Purpose: Writes out the subject IDs with non zero values given a trait or traits selected, 
phenotype loaded, and pedigree loaded.  The option to not include the pedigree can be
selected using the -no_ped argument.  

Usage: write_subject_ids -out <output filename> [optional: -no_ped ]

Example:
load ped ped.csv
load pheno phenotype.csv
trait trait_one
write_subject_ids -out trait_one_subject_ids

Return to command index

A1.226 zs


Purpose:  The old zscore command to zscore current trait
         
Old Usage:    zscore [-off] [-q]
              zs     [-off]       ;# Perform zscore quietly

              -off                Turn off zscore
              -q                  Perform zscore quietly

Notes:    The "Mean" and "SD" values used by zscore are computed only
          once, at the time the zscore command is given.  Thus they do
          not reflect later changes to the phenotypes file, or to the
          sample, which might be restricted due to individuals missing
          covariates added later.  Generally, for this reason the
          zscore command should be given after the covariates command
          and immediately before a model maximizing command such as
          polygenic.

          Starting with SOLAR Version 4.0.9, the trait mean and SD
          are computed from the actual sample that would be included
          in an analysis (at the time the zscore command is given).

          As described in the notes below, you can adjust the Mean
          and SD by using "option zmean1" and "option zsd1" to set
          the values actually used.  These values are applied to
          the trait values during maximization.

          If the trait is changed without giving the "model new"
          command, the new trait will be zscored automatically.
          This feature is obsolescent.  In a future update, zscore
          will be turned off when the trait is changed.

          An alternative to zscore is to define the trait as the
          inverse normal transformation of a variable.  See
          "help inormal" and "help define" for further details.
          
          zscore will also calculate a number of statistics
          for the trait: mean, minimum, maximum, standard deviation,
          skewness, and kurtosis.  These will be written to the file
          zscore.out in the current output directory.  As of version
          4.0.9, these statistics are no longer written to the terminal.
          Instead, a single line is displayed with the trait name,
          mean, and SD.  Even that line is not shown if zscore is
          invoked from a script or the zs abbreviation of the command
          is used.

          To calculate these statistics for any phenotypic variable without
          zscoring and without necessarily making it the trait, use the
          "stats" command instead.

          A trait must already have been selected with the trait command
          or loaded model.  Also the phenotypes file must have been loaded.

          When a maximization is performed, trait values are replaced with
          their zscored values.  The formula is:

          zscored = (value - Mean) / SD

          zscore is a model dependent option controlled by "option zscore".
          It remains in effect until another model is loaded or the
          "model new" command is given.  When models maximized with zscore
          are reloaded, zscore is again activated.

          "option zscore" is set to 1 ("on") by this command, and the
          related options zmean1 and zsd1 (mean and standard deviation
          for the first trait) and zmean2 and zsd2 (mean and standard
          deviation for the second trait) are set as required.  You can
          adjust these options directly to fine tune the mean and standard
          deviation values used, but be sure that zscore is not set to 1
          until the mean and (non-zero !) standard deviation values are
          set for all traits in the model.

          In a multivariate model, zscore will only be applied to the
          first two traits.

          Whenever zscore is activated or deactivated, parameters mean
          and SD are reset to zero to force setting new boundaries and
          starting point during the next maximization.

          If a new phenotypes file is loaded, the zscore command should be
          repeated to reflect the new file.

Return to command index

A1.227 zscore


Purpose:  Zscore current trait(s) or covariate(s)

Usage:    define defname = zscore_phenotype 
          trait defname
          OR
          covariate defname

          (defname is any user defined name, phenotype is any phenotype name)

Notes:    zscore_ is a prefix that may be used in the define command,
          similar to the inormal_ prefix.  Once a definition has been
          created, it may be used in either the trait or covariate commands.
          For further information, see "help define".

          The Mean and SD are obtained from the current maximization sample,
          not the entire phenotypes file.

          In versions of SOLAR prior to 4.4.0, zscore was a command that
          could be only used to zscore the current trait.  That command
          is still available as before, but was considered obsolescent.
          It was difficult and problemantical.  For information about that
          command, for understanding previous uses, see "help old_zscore".


Return to command index