From Frank Harrell's graph course notes
Arguments for tweaking histSpike
histSpike(x, side=1, nint=100, frac=.05, minf=NULL, multwidth=1,
type=c('proportion','count','density'),
xlim=range(x), ylim=c(0,max(f)), xlab=deparse(substitute(x)),
ylab=switch(type,proportion='Proportion',
count ='Frequency',
density ='Density'),
y=NULL, curve=NULL, add=FALSE,
bottom.align=type=='density', col=par('col'), lwd=par('lwd'),
grid=FALSE, ...)
- # frac: relative length of the plot used to represent the maximum frequency.
- # side: axis side to use (1=bottom (default for histSpike), 2=left, 3=top (default for scat1d), 4=right)
- # add=TRUE: adds the spike histogram to the existing plot instead of starting a new one
- # tfrac: fraction of tick mark to actually draw. If tfrac<1, a random fraction tfrac of the line segment is drawn at each point. This is useful for very large samples or ones with some very dense points. The default value is 1 if the number of non-missing observations n is less than 125, and max(.1, 125/n) otherwise. # Hmm... second part is somewhat dense.
- # eps: fraction of axis for determining overlapping points in x. For preserve=TRUE the default is 0 and original unique values are retained; bigger values of eps tend to bias observations from dense to sparse regions, but ranks are still preserved.
- # lwd: line width for tick marks, passed to segments
- # col: color for tick marks, passed to segments
- # y: specify a vector the same length as x to draw tick marks along a curve instead of by one of the axes. The y values are often predicted values from a model. The side argument is ignored when y is given. If the curve is already represented as a table look-up, you may specify it using the curve argument instead. y may be a scalar to use a constant vertical placement. #huh?
- # curve: a list containing elements x and y for which linear interpolation is used to derive y values corresponding to values of x. This results in tick marks being drawn along the curve. For histSpike, interpolated y values are derived for bin midpoints.
- # bottom.align: set to TRUE to have the bottoms of tick marks (for side=1 or side=3) aligned at the y-coordinate. The default behavior is to center the tick marks. For datadensity.data.frame, bottom.align defaults to TRUE if nint>1. In other words, if you are only labeling the first and last axis tick mark, the scat1d tick marks are centered on the variable's axis.
- # type: used by or passed to histSpike. Set to "count" to display frequency counts rather than relative frequencies, or "density" to display a kernel density estimate computed using the density function.
- # grid: set to TRUE if the R grid package is in effect for the current plot
- # nint: number of intervals into which each continuous variable's axis is divided for datadensity. For histSpike, nint is the number of equal-width intervals used to bin x; if nint is instead a character string (e.g., nint="all"), the frequency tabulation is done with no binning. In other words, frequencies for all unique values of x are derived and plotted.
- # presorted: set to TRUE to prevent from sorting for determining the order l
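The tfrac default and the roles of nint and frac can be worked through in a few lines of base R. This is only a rough sketch, not Hmisc's implementation: tfrac_default and spike_heights are hypothetical helper names, the data are simulated, and the binning merely approximates what histSpike does.

```r
# Hypothetical helpers for illustration; not part of Hmisc.

# Default tfrac as described in the notes:
# 1 when n < 125, otherwise max(.1, 125/n).
tfrac_default <- function(n) if (n < 125) 1 else max(0.1, 125 / n)

# Approximate spike heights: bin x into nint equal-width intervals and
# scale so the largest proportion spans frac of the axis range yrange.
spike_heights <- function(x, nint = 100, frac = 0.05, yrange = c(0, 1)) {
  x <- x[!is.na(x)]
  breaks <- seq(min(x), max(x), length.out = nint + 1)
  prop <- as.vector(table(cut(x, breaks, include.lowest = TRUE))) / length(x)
  data.frame(
    mid    = (breaks[-1] + breaks[-length(breaks)]) / 2,  # bin midpoints
    prop   = prop,
    height = prop / max(prop) * frac * diff(yrange)
  )
}

set.seed(1)
x <- rnorm(500)
y <- x + rnorm(500)

pdf(NULL)                      # off-screen device so the sketch runs anywhere
plot(x, y)
usr <- par("usr")              # c(x1, x2, y1, y2) of the plot region
s <- spike_heights(x, nint = 100, frac = 0.05, yrange = usr[3:4])
segments(s$mid, usr[3], s$mid, usr[3] + s$height)  # spikes along side 1
invisible(dev.off())
```

With Hmisc loaded, the corresponding call after plot(x, y) would presumably be histSpike(x, add=TRUE), going by the signature above.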
General form of the first argument for creating graphs:
vertical variable ~ horizontal variable | row.conditioner * column.conditioner * page.conditioner, groups=superposition.variable
# groups makes separate lines or symbols within a panel.
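A small sketch of that formula form using lattice's xyplot, assuming the lattice package is installed. The data are simulated and every variable name (sbp, age, sex, site, treat) is made up for illustration.

```r
# Simulated data; every variable name here is made up for illustration.
set.seed(2)
d <- data.frame(
  age   = runif(200, 20, 80),
  sex   = sample(c("female", "male"), 200, replace = TRUE),
  site  = sample(c("A", "B"), 200, replace = TRUE),
  treat = sample(c("control", "treated"), 200, replace = TRUE)
)
d$sbp <- 100 + 0.5 * d$age + rnorm(200, sd = 10)

p <- NULL
if (requireNamespace("lattice", quietly = TRUE)) {
  # vertical ~ horizontal | two conditioning variables,
  # with groups= superposing the treatment arms within each panel
  p <- lattice::xyplot(sbp ~ age | sex * site, groups = treat, data = d,
                       auto.key = TRUE)
  pdf(NULL)   # off-screen device
  print(p)    # lattice objects are drawn when printed
  invisible(dev.off())
}
```

Each sex-by-site combination gets its own panel, and the two levels of treat appear as different symbols inside every panel.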
Plotting commands in R
- contour # contour plot
- coplot # separate plots of different ranges
- ecdf # empirical distribution function plot (Hmisc)
- faces # Chernoff faces for multivariate data # What is this?!
- nomogram # nomograms (Design)
- persp # 3-D perspective plots of grids
- pie # pie charts
- plclust # plots of cluster trees from hclust
- plot.Design # family of functions for fitted objects
- plsmo # plot smoothed nonparametric estimates (Hmisc)
- scat1d # add data density (rug plot) to plot (Hmisc enhancement of rug)
- survplot # survival plots (Design)
- tsplot # time series plots
Interesting ones for the dissertation:
- usa #map of the US #Location of other studies of Black/White populations
- # doesn't really work if typed on the console
- symbol.freq # diagram of frequency table (Hmisc)
- qqnorm # normal probability plot
- qqplot # quantile-quantile plot
- plot.summary.Design #plots effect ratios and CIs (Design)
- plot.summary.formula # plotting functions for summary.formula function (Hmisc)
- plot # scatterplot or line plot
- plot.anova.Design # Dot chart of anova table (Design)
- pairs # all possible pairs of scatterplots
- hist # histogram
- hist.data.frame # histogram of all variables in a data frame (Hmisc)
- histSpike # high–resolution “spike” histograms and density plots
- labcurve # draw and label curves or label existing curves (Hmisc)
- datadensity # multivariable version of Hmisc’s scat1d
- # displays data density for all variables in a data frame
- dotchart # displays values based on position of dots
- barplot # vertical or horizontal bar graph
- bpplot # box–percentile plots (Hmisc)
- boxplot # side-by-side boxplots
From Frank Harrell's Hmisc library for R.
histSpike: Add high-resolution spike histograms or density estimates to an existing plot
plot(density(x), type='l')
densityplot(~ x) # Trellis/Lattice version
hist(x, probability=TRUE, nclass=20)
lines(density(x)) # adds the density curve to the histogram
# probability=TRUE scales the y-axis so the total area of the bars is 1.
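The area claim can be checked directly from the object that hist returns. A minimal sketch on simulated data (the exponential sample is just an arbitrary choice):

```r
set.seed(3)
x <- rexp(300)

pdf(NULL)                                       # off-screen device
h <- hist(x, probability = TRUE, nclass = 20)   # bar heights are densities
lines(density(x))                               # kernel density overlay
invisible(dev.off())

# Total bar area: sum of height * width over all bins.
bar_area <- sum(h$density * diff(h$breaks))
```

Here bar_area comes out as 1 up to floating-point rounding, which is what puts the histogram and the density curve on the same vertical scale.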
Adding titles
plot(x, y, main="Main \ntitle", sub='Subtitle', adj=0)
# \n jumps one line down on the output, rather like perl.
# adj=0 Left justification
# adj=0.5 center justification
# adj=1 right justification
par(mfrow=c(2,2), oma=c(0,0,2,0))
# A 2X2 matrix of plots
# leave 2 lines for the overall top title (oma sets the outer margins, in lines: bottom, left, top, right); the title is drawn in that outer margin
mtitle('Overall title')
# A title for several graphs together
pstamp()
# date and time stamp on the lower right
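For a version that runs without Hmisc, the same layout can be sketched in base R. Note the swap: mtext(..., outer=TRUE) stands in for mtitle here, so this is an analogue rather than the Hmisc call itself, and the plotted data are simulated.

```r
pdf(NULL)                                         # off-screen device
op <- par(mfrow = c(2, 2), oma = c(0, 0, 2, 0))   # 2x2 grid, 2 outer lines on top
layout_used <- par("mfrow")
set.seed(4)
for (i in 1:4) {
  # adj = 0 left-justifies the main title and subtitle of each panel
  plot(rnorm(50), rnorm(50), main = paste("Panel", i),
       sub = "Subtitle", adj = 0, xlab = "x", ylab = "y")
}
# One title for all four panels, written into the top outer margin
mtext("Overall title", side = 3, outer = TRUE, line = 0.5)
par(op)
invisible(dev.off())
```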
Lines and Symbols
plot(x, y)
axis(3)
# add axis on top (ticks & labels)
axis(4, labels=FALSE)
# axis on the right, ticks only
lines(1:3, c(2,4,-1))
# add a line through (1,2), (2,4), (3,-1): could be useful for drawing a line at OR=1 to specify the null in the OR graphs.
points(locator())
# add clicked points
text(.2, 1.3, 'Text')
# add text
text (locator(1), "Mytext")
# add text at click
Reference Lines
abline (a=0, b=1)
# line of identity (a, b=intercept, slope)
abline (a=0, b=1, lty=2)
# dashed line (lty=2; lty=3 is dotted). Line types are specified with the lty option; could maybe use this for the 95% CI lines
abline (h=c(1, 3))
# horizontal line at y=1, 3
abline(v=0)
# vertical line at x=0
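Putting the line and reference-line commands together for the odds ratio idea, here is a minimal sketch. The odds ratios and confidence limits are made-up numbers for illustration only, not results from any study.

```r
# Made-up odds ratios and confidence limits, for illustration only.
expo  <- 1:5
or    <- c(0.8, 1.0, 1.3, 1.7, 2.4)
lower <- c(0.5, 0.7, 0.9, 1.1, 1.5)
upper <- c(1.3, 1.5, 1.9, 2.6, 3.8)

pdf(NULL)                              # off-screen device
plot(expo, or, type = "b", log = "y", ylim = range(lower, upper),
     xlab = "Exposure level", ylab = "Odds ratio")
lines(expo, lower, lty = 2)            # dashed lower confidence limit
lines(expo, upper, lty = 2)            # dashed upper confidence limit
abline(h = 1, lty = 3)                 # dotted reference line at the null, OR = 1
text(1.2, 3, "Text", adj = 0)          # label at fixed coordinates
axis(4, labels = FALSE)                # ticks only on the right
invisible(dev.off())
```

The log scale on the y-axis is a design choice of this sketch: it makes OR = 2 and OR = 0.5 sit equally far from the reference line.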
Interaction:
Could be shown as scatterplots or dotplots for different groups.
Shows how one or more categorical variables are related to a single continuous numeric response variable.
Use the Dotplot function or the dotchart2 function in Hmisc.
summary.formula creates the summary to plot.
Scatterplot matrices
Show all pairwise relationships from among 3 or more continuous variables.
pairs(dataframe[, exposures])
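As a concrete instance of the call above, the built-in iris data can stand in for dataframe, with exposures as a character vector of column names. Both choices are assumptions for illustration.

```r
# exposures names the continuous columns to compare pairwise.
exposures <- c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")

pdf(NULL)                    # off-screen device
pairs(iris[, exposures])     # 4 x 4 grid of scatterplots
invisible(dev.off())
```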
----------------
- Multiple graphs on a common scale:
- Group all variables with age on the x-axis together
- Group all variables with span in months together
- Sort by order of values attached to categories (improves accuracy of perception). But this is not necessarily true when the order of categories is important.
- Grouping is necessary for some tables but not for graphs --> Kernel density distribution is a better representation of distribution.
- Minimize the use of remote legends. Curves can be labeled at points of maximum separation (see the Hmisc labcurve function).
- Notations and Symbols: As consistent as possible with the other parts of the document.
- Effective Coding Scheme for two lines: Thin Black Line and Thick gray scale line. (Possibly for the OR and CI bounds)
- Single categorical Variable: Use a dot plot or horizontal bar chart to show the proportion corresponding to each category. Second choices for values are percentages and frequencies. The total sample size and number of missing values should be displayed somewhere on the page. If there are many categories and they are not naturally ordered, you may want to order them by the relative frequency to help the reader estimate values.
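The single categorical variable advice can be sketched with base R's dotchart. The built-in chickwts data are used only as a stand-in; the categories are ordered by relative frequency, and the sample size and number of missing values are written on the plot.

```r
feed <- chickwts$feed                       # built-in factor with 6 categories
n_total   <- length(feed)
n_missing <- sum(is.na(feed))
props <- sort(prop.table(table(feed)))      # ascending, so the largest plots on top

pdf(NULL)                                   # off-screen device
dotchart(as.vector(props), labels = names(props), xlab = "Proportion",
         xlim = c(0, max(props)))
# Show the total sample size and missing count on the page
mtext(sprintf("n = %d, missing = %d", n_total, n_missing), side = 3, adj = 1)
invisible(dev.off())
```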
For Specific Aims.
Page 40: Odds ratio graphs
Page 41: Trend of Odds (Also do this for categorical.)
Page 42: Arrange according to contribution of variables.