

Visualize Menu

The Visualize Menu contains entries for nonparametric estimation procedures. The available options differ between the univariate and multivariate modes. The following options are provided by the Visualize Menu:

Univariate Mode:
    Kernel Density
    Histogram
    Scatterplot (2D)
    Boxplot
    Clusterplot
    Sample QF
    Sample DF
    Sample Mean Excess
    Sample Median Excess
    Sample Hazard Function
    Sample Autocov. Function
    Sample Autocorr. Function
    Sample Path

Multivariate Mode:
    Scatterplot (3D)
    Scatterplot (2D)
    Kernel Density
    Sample DF
    Sample SF
    Sample C. Dep. F.
    Boxplot
    Surface Plot
    Contour Plot



Univariate Mode

Kernel Density (univariate mode)

This option plots a kernel density with a given bandwidth for the active data set. Kernel densities can be applied to data sets of type Xtremes Univariate Data and Xtremes Censored Data. The dialog box provides the following options:
Bandwidth
Enter a bandwidth for kernel density estimation.
Automatic selection
Mark this field so that Xtremes computes an appropriate bandwidth for the active data set.
Kernel
Select a kernel for the estimation. Four different kernels are available. Notice that the 3rd and 4th kernels also attain negative values (like every probability kernel which is orthogonal to x**2).
Bounded
Select one of the four buttons to specify boundaries of the kernel density (determined by the sample minimum and maximum). For example, select Nowhere to get a kernel density, where the support is not restricted in advance.
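
As a rough illustration of what the Bandwidth and Kernel settings control, the following Python sketch evaluates a kernel density of the form (1/(n b)) Sigma k((x - x[i])/b) with the Epanechnikov kernel. The function names and the sample values are hypothetical, the boundary handling of the Bounded option is not modeled, and this is not the Xtremes implementation.

```python
def epanechnikov(t):
    """Epanechnikov kernel: 0.75 * (1 - t^2) on [-1, 1], zero elsewhere."""
    return 0.75 * (1.0 - t * t) if abs(t) <= 1.0 else 0.0


def kernel_density(x, data, bandwidth):
    """Kernel density estimate at x: (1/(n*b)) * sum of k((x - x_i)/b)."""
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    n = len(data)
    return sum(epanechnikov((x - xi) / bandwidth) for xi in data) / (n * bandwidth)


# Hypothetical sample, only for illustration.
sample = [0.2, 0.5, 0.9, 1.4, 2.1]
density_at_1 = kernel_density(1.0, sample, bandwidth=1.0)
```

A larger bandwidth spreads each observation over a wider interval and yields a smoother curve; a smaller one follows the data more closely.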

Histogram

This option plots a histogram of the active data set and can be applied to Xtremes Grouped Data and Xtremes Discrete Data.

Scatterplot (2D)

For Xtremes Time Series and Xtremes Multivariate Data, a scatterplot of points ( i, y[i] ) or ( x[i], y[i] ) becomes available in a special plot window which is a plain plot window equipped with the additional facility to plot and to handle a scatterplot. Different data sets are represented in different scatterplot windows. First select two components of the active data set in the case of Xtremes Multivariate Data.
A related facility is Clusterplot in the Visualize Menu.

Time series facilities (scatterplot)

Plot a Least Squares Polynomial in the scatterplot window (based on the active data) or execute Moving Average, Lowess to obtain moving averages and residuals - where the latter are still represented by a scatterplot - in separate scatterplot windows. Another option is Seasonal Component.

Cutting facilities (scatterplot)

Points below a threshold can be made inactive by using the Threshold option in the local menu of the scatterplot window (right-click in the window) or by using the (scissors) point selection mouse mode. The inactive points are marked in green. The active points can be saved to a file by executing the Save Actual Points option.

Conversion of a scatterplot into a curve

The full plot options (option mouse mode) become applicable by copying the scatterplot into the clipboard using the moving mouse mode. After this operation, the scatterplot - with connected points - is dealt with as a curve. Afterwards, drag or copy the "scatterplot curve" to another plot or scatterplot window according to your choice.

Boxplot

A boxplot for Xtremes Univariate Data x[1],...,x[n] consists of
The boxplot option can also be applied to Xtremes Multivariate Data. The boxplots for the single univariate data sets (in the different columns) are plotted at the positions 1,2,3, .... An exception is made if the column headers contain only real numbers. In that case, these real numbers determine the plotting positions.
We remark that each boxplot is dealt with in the same manner as a single function in a plotting window.

Boxplots are not dealt with in Statistical Analysis because they are adjusted to normal data. The data plotted outside of the interval I are called outliers (indicating that the normal modeling may not be correct).

Clusterplot

Let the active data set be of type Xtremes Time Series. Denote by k the number of exceedances over a threshold u and denote by m(k) the number of clusters. The sample mean cluster size (relative to u or k) is given by

mcsize(k) := k/m(k), k=1,...,n.


One may execute Visualize ... Clusterplot to obtain a scatterplot of (1/n, x[1]), ..., (1, x[n]) and, additionally, plots of (k, mcsize(k)) and (k, 1/mcsize(k)) for k=1,...,n in the graphics windows Sample Mean Cluster Size and Reciprocal Sample Mean Cluster Size. Let u and, thus, k be fixed. Denote by |m| the cluster size of a cluster m of exceedance times. Then

P[k]({x}) := |{m : |m| = x}| / m(k), x = 1,...,k

defines the sample cluster size distribution P[k]. Notice that mcsize(k) is the mean of the distribution P[k]. Creating a threshold in the Clusterplot window (SHIFT+leftclick) also opens a window Sample Cluster Size Distribution displaying P[k]({x}) for x=1,...,k by means of a histogram.
For more information, see Cluster Options.
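
The quantities mcsize(k) and P[k] can be sketched in Python as follows. How clusters are formed is not specified on this page, so the sketch assumes a runs-type rule: a new cluster starts when the distance between two consecutive exceedance times is larger than a hypothetical parameter run_length. All names and the example series are illustrative only.

```python
from collections import Counter


def clusters_of_exceedances(series, u, run_length=1):
    """Group exceedance times (indices i with series[i] > u) into clusters.

    Assumption: a new cluster starts when the distance between two
    consecutive exceedance times is larger than run_length.
    """
    clusters = []
    last = None
    for i, value in enumerate(series):
        if value > u:
            if last is None or i - last > run_length:
                clusters.append([i])
            else:
                clusters[-1].append(i)
            last = i
    return clusters


def mean_cluster_size(clusters):
    """mcsize(k) = k / m(k): number of exceedances over number of clusters."""
    if not clusters:
        raise ValueError("no exceedances over the threshold")
    k = sum(len(c) for c in clusters)
    return k / len(clusters)


def cluster_size_distribution(clusters):
    """P[k]({x}): fraction of clusters having exactly x exceedances."""
    counts = Counter(len(c) for c in clusters)
    m = len(clusters)
    return {size: counts[size] / m for size in sorted(counts)}


# Hypothetical time series and threshold.
series = [1, 5, 6, 1, 1, 7, 1, 8, 9, 9, 1]
clusters = clusters_of_exceedances(series, u=4)
mcsize = mean_cluster_size(clusters)
distribution = cluster_size_distribution(clusters)
```

In this example there are k = 6 exceedances in m(k) = 3 clusters, so mcsize equals 2, which is also the mean of the cluster size distribution.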

Sample QF

This option plots the sample qf for the active data set which must be of type Xtremes Univariate Data, Xtremes Censored Data or Xtremes Grouped Data. The empirical qf for grouped data is the qf pertaining to the histogram.

Sample DF

This option plots the sample df for the active data set which must be of type Xtremes Univariate Data, Xtremes Censored Data or Xtremes Grouped Data. The empirical df for grouped data is the df pertaining to the histogram.
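
For ungrouped univariate data, the sample df and sample qf can be sketched in Python as below. The sample qf is taken here as the generalized inverse of the sample df, which is an assumption; Xtremes may use a different convention (for example, an interpolated version). Function names and data are hypothetical.

```python
import math
from bisect import bisect_right


def sample_df(x, data):
    """Sample df: fraction of observations less than or equal to x."""
    ordered = sorted(data)
    return bisect_right(ordered, x) / len(ordered)


def sample_qf(q, data):
    """Sample qf as generalized inverse: smallest observation x with F_n(x) >= q."""
    if not 0.0 < q <= 1.0:
        raise ValueError("q must lie in (0, 1]")
    ordered = sorted(data)
    return ordered[math.ceil(q * len(ordered)) - 1]


# Hypothetical data set.
data = [3.0, 1.0, 4.0, 1.5, 2.0]
df_at_2 = sample_df(2.0, data)
median_value = sample_qf(0.5, data)
```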

Sample Mean Excess

This option plots the empirical mean excess function for the active data set which must be of type Xtremes Univariate Data or Xtremes Grouped Data.

Options of dialog box:
Choose trimming parameter p
Enter a trimming parameter p for the trimmed mean excess function.
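
A minimal Python sketch of the untrimmed empirical mean excess function is given below: for a threshold u it averages the excesses x[i] - u over all observations with x[i] > u. How the trimming parameter p enters the trimmed version is not described on this page, so trimming is left out. The function name and data are hypothetical.

```python
def sample_mean_excess(u, data):
    """Mean of the excesses x - u over all observations x > u (no trimming)."""
    excesses = [x - u for x in data if x > u]
    if not excesses:
        raise ValueError("no observations exceed the threshold u")
    return sum(excesses) / len(excesses)


# Hypothetical data set and threshold.
data = [1.0, 2.0, 3.0, 4.0, 10.0]
mean_excess = sample_mean_excess(2.5, data)
```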

Sample Median Excess

This option plots the sample median excess function for the active data set which must be of type Xtremes Univariate Data or Xtremes Grouped Data. If X has the df F, the median excess function is the median of the conditional distribution F( · |u) of X - u given X > u,

m(u,F) = F**(-1)(1/2,u) ,

with F**(-1) being the qf of the df F.
We obtain the sample median excess function as m( · ,F[n]). For our computations we evaluate

m( x[n-k+1:n], F[n] ), 6 <= k <= n,

and take a linear interpolation thereof. The sample median excess function for grouped data is the excess function pertaining to the histogram.
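
The evaluation at the order statistics followed by linear interpolation can be sketched in Python as follows. The sketch assumes ungrouped data and uses statistics.median, which averages the two middle values for an even number of excesses; the qf-based convention in Xtremes may differ. All names are hypothetical.

```python
from statistics import median


def median_excess_points(data, k_min=6):
    """Evaluate the sample median excess at u = x[n-k+1:n], k_min <= k <= n."""
    ordered = sorted(data)
    n = len(ordered)
    points = []
    for k in range(k_min, n + 1):
        u = ordered[n - k]  # x[n-k+1:n] in 1-based order statistic notation
        excesses = [x - u for x in ordered if x > u]
        if excesses:  # skipped only if ties leave no observation above u
            points.append((u, median(excesses)))
    points.sort()
    return points


def interpolate(u, points):
    """Linear interpolation between neighbouring (threshold, value) points."""
    for (u0, m0), (u1, m1) in zip(points, points[1:]):
        if u0 <= u <= u1 and u1 > u0:
            return m0 + (m1 - m0) * (u - u0) / (u1 - u0)
    raise ValueError("u lies outside the evaluated thresholds")


# Hypothetical data set.
data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
points = median_excess_points(data)
value = interpolate(2.5, points)
```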

Sample Hazard Function

For Xtremes Univariate Data, take the kernel hazard function as an estimator of the hazard function h[F], where f[n,b] is the kernel density for the Epanechnikov kernel. Xtremes asks you to enter a bandwidth beta in the dialog box Enter bandwidth. The default value is 1. For Xtremes Grouped Data, again with frequencies n[j] in cells (t[j], t[j+1]), the sample hazard function is defined via the histogram.

Sample Autocovariance

It is assumed that the data come from a weakly stationary time series. Display the sample autocovariance function

r(h) = (1/n) Sigma(i=1,...,n-h) (x[i] - mu) (x[i+h] - mu) ,

with mu denoting the sample mean.
It is advisable to detrend and deseasonalize the data using the options in the local menu of the scatterplot window.

Sample Autocorrelation

Display the sample autocorrelation function

rho(h) = r(h) / r(0)

with r(h) being the sample autocovariance function.
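
Both functions can be sketched in a few lines of Python. The sum runs over those indices i for which both x[i] and x[i+h] exist, and the divisor is n for every lag, as in the formula for r(h). Function names and data are hypothetical.

```python
def sample_autocovariance(x, h):
    """r(h) = (1/n) * sum of (x[i] - mu) * (x[i+h] - mu), mu the sample mean."""
    n = len(x)
    if not 0 <= h < n:
        raise ValueError("lag h must satisfy 0 <= h < n")
    mu = sum(x) / n
    return sum((x[i] - mu) * (x[i + h] - mu) for i in range(n - h)) / n


def sample_autocorrelation(x, h):
    """rho(h) = r(h) / r(0)."""
    return sample_autocovariance(x, h) / sample_autocovariance(x, 0)


# Hypothetical series with a linear trend.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
r0 = sample_autocovariance(x, 0)
rho1 = sample_autocorrelation(x, 1)
```

Note that rho(0) always equals 1; a trend such as the one in this example series inflates the autocorrelations, which is why detrending is advisable before inspecting them.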

Sample Path

Plot the sample path of the active univariate data set, i.e. the sample df multiplied by the sample size.

Multivariate Mode

Scatterplot (3D)

Plot data points (x , y, z) in a three-dimensional coordinate system. For plotting, one must select three variables of the multivariate data set. This option is applicable to Xtremes Multivariate Data.

Kernel Density (multivariate)

Let k be a univariate kernel (see univariate kernel density). Then, u(x) = k(x[1]) ... k(x[d]) defines a d-variate kernel.

For visualization, only bivariate Gaussian densities are utilized. Bandwidth and direction of the kernel may be adjusted by using the parameter varying mouse mode.
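
The product construction u(x) = k(x[1]) ... k(x[d]) can be illustrated by the Python sketch below, which uses a univariate Gaussian kernel and a single common bandwidth for all components. The common bandwidth and all function names are assumptions made for this illustration; the direction adjustment mentioned above is not modeled.

```python
import math


def gaussian_kernel(t):
    """Univariate standard Gaussian kernel."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)


def product_kernel(x, k=gaussian_kernel):
    """d-variate kernel u(x) = k(x[1]) * ... * k(x[d])."""
    return math.prod(k(xj) for xj in x)


def product_kernel_density(x, data, bandwidth):
    """Kernel density at x: (1/(n*b^d)) * sum of u((x - x_i)/b)."""
    n = len(data)
    d = len(x)
    total = sum(
        product_kernel([(xj - pj) / bandwidth for xj, pj in zip(x, point)])
        for point in data
    )
    return total / (n * bandwidth ** d)


# Hypothetical bivariate data set.
data = [[0.0, 0.0], [1.0, 0.5], [0.5, 1.5]]
density_at_origin = product_kernel_density([0.0, 0.0], data, bandwidth=1.0)
```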

Sample DF (multivariate)

Plot the sample df for the given multivariate data set. One must select two components first.

Sample SF (multivariate)

Plot the sample survivor function (sf) for the given multivariate data set. One must select two components first.

Sample Canonical Dependence Function

Plot the sample canonical dependence function for the given multivariate data set. One must select two components first.
For details see Statistical Analysis, (10.17) and the comments in Chapters 9 and 10.

Surface Plot

The Surface Plot option performs a surface plot of data stored in a multivariate data set, which must follow the structure described below. Given a multivariate data set with points
 x[i]   = x0 + (i-1)/(n-1) (x1-x0),
 y[j]   = y0 + (j-1)/(m-1) (y1-y0),
 z[i,j] = f( x[i], y[j] ),
1 <= i <= n, 1 <= j <= m, one can display a surface plot of the function obtained by an interpolation of the points (x[i], y[j], z[i,j]).
Xtremes detects if a lattice is defined by (x[i], y[j]), yet one must store the points ordered according to their x- and y-coordinates.

Example:
The following data set contains supporting points of the function f(x,y) = xy. Only the actual data are listed:
"x"   "y"   "z"
 1     1     1
 1     2     2
 1     3     3
 2     1     2
 2     2     4
 2     3     6
 3     1     3
 3     2     6
 3     3     9
...
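
Rows of this kind can be generated with a short Python sketch. It assumes an equally spaced lattice of n points between x0 and x1 and m points between y0 and y1 (with n, m >= 2), and it emits the rows ordered by x and then by y, as in the example. The function name is hypothetical, and writing the rows to an Xtremes data file is not shown.

```python
def lattice_rows(f, x0, x1, n, y0, y1, m):
    """Return rows (x[i], y[j], f(x[i], y[j])) ordered by x, then by y."""
    if n < 2 or m < 2:
        raise ValueError("n and m must be at least 2")
    rows = []
    for i in range(1, n + 1):
        x = x0 + (i - 1) / (n - 1) * (x1 - x0)
        for j in range(1, m + 1):
            y = y0 + (j - 1) / (m - 1) * (y1 - y0)
            rows.append((x, y, f(x, y)))
    return rows


# Supporting points of f(x, y) = x * y on the lattice {1, 2, 3} x {1, 2, 3}.
rows = lattice_rows(lambda x, y: x * y, 1, 3, 3, 1, 3, 3)
```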

Contour Plot

The Contour Plot option performs a contour plot of data stored in a multivariate data set, which must follow the structure described below. Given a multivariate data set with points
 x[i]   = x0 + (i-1)/(n-1) (x1-x0),
 y[j]   = y0 + (j-1)/(m-1) (y1-y0),
 z[i,j] = f( x[i], y[j] ),
1 <= i <= n, 1 <= j <= m, one can display a contour plot of the function obtained by an interpolation of the points (x[i], y[j], z[i,j]).
Xtremes detects if a lattice is defined by (x[i], y[j]), yet one must store the points ordered according to their x- and y-coordinates.

Example:
The following data set contains supporting points of the function f(x,y) = xy. Only the actual data are listed:
"x"   "y"   "z"
 1     1     1
 1     2     2
 1     3     3
 2     1     2
 2     2     4
 2     3     6
 3     1     3
 3     2     6
 3     3     9
...

© 2005
Xtremes Group · updated Jun 21, 2005