Data Menu
Menus for the chosen domain are listed in the menu bar. The default domain is
D(iscrete) for certain discrete distributions.
The data menu contains entries to generate, transform and convert
data sets. The parametric options depend on the selected
mode and domain. The following options are provided by the Data
Menu:
Read Data | Generate Univariate Data |
Generate Bivariate Data | Generate Multivariate Data |
Generate Time Series | Generate Counting/Point Process |
Transform Data | Convert to |
Choose Data | List Data |
Quit |
Read Data
Load a data set from a file by means of
the Read Data option.
Several data sets are stored in
the dat subdirectory. There are the following options
in the dialog box:
Generate Univariate Data
The following options enable the generation of a univariate data set.
Note that the distributions belong to
different domains.
Discrete domain: | Uniform | Binomial |
Poisson | Negative Binomial | |
SUM domain: | Gaussian | Gaussian-GCauchy |
Student Distributions | Non-central Student | |
Sum-Stable Distributions | ||
MAX domain: | Gumbel (EV 0) | Frechet (EV 1) |
Weibull (EV 2) | EV | |
POT domain: | Exponential (GP 0) | Pareto (GP 1) |
Beta (GP 2) | GP |
Uniform
Generate a data set according to a discrete uniform distribution.
Options:
r | integer |
s | integer > r |
Samplesize | positive integer |
Filename | Select a filename, and, optionally, a directory. |
Binomial
Generate a data set according to a binomial B(n,p) distribution.
Options:
n | positive integer |
p | [ 0 , 1 ] |
Samplesize | positive integer |
Filename | Select a filename, and, optionally, a directory. |
Poisson
Generate a data set according to a Poisson P(lambda) distribution.
Options:
lambda | positive real |
Samplesize | positive integer |
Filename | Select a filename, and, optionally, a directory. |
Negative Binomial
Generate a data set according to a negative binomial distribution.
Options:
r | parameter | nonnegative real |
p | parameter | ( 0 , 1 ) |
Samplesize | positive integer | |
Filename | Select a filename, and, optionally, a directory. |
Gaussian
Generate a data set according to
a Gaussian distribution.
Options:
mu | location parameter | real |
sigma | scale parameter | positive real |
Filename | Select a filename, and, optionally, a directory. |
Gaussian-GCauchy
Generate a data set according to a mixture of a
Gaussian and a GCauchy distribution.
Options:
mu | location parameter | real |
sigma | scale parameter (GCauchy) | positive real |
d | contamination parameter | [ 0 , 1 ] |
alpha | shape parameter | positive real |
sigma 1 | scale parameter (Gaussian) | positive real |
Filename | Select a filename, and, optionally, a directory. |
Student Distributions
Generate a data set according to
a Student distribution.
Options:
sigma | scale parameter | positive real |
alpha | shape parameter | positive real |
Filename | Select a filename, and, optionally, a directory. |
Non-central Student
Generate a data set according to a non-central Student distribution.
Sum-Stable Distributions
Generate a data set according to a sum-stable distribution.
Options:
alpha | shape parameter | ( 0 , 2 ] |
skewness | skewness | ( -1 , 1 ) |
mu | location parameter | real |
sigma | scale parameter | positive real |
Filename | Select a filename, and, optionally, a directory. |
Gumbel (EV 0)
Generate a data set according to a Gumbel distribution.
Options:
mu | location parameter | real |
sigma | scale parameter | positive real |
Filename | Select a filename, and, optionally, a directory. |
Frechet (EV 1)
Generate a data set according to a Frechet (EV 1) distribution.
Options:
alpha | shape | positive real |
mu | location | real |
sigma | scale | positive real |
Filename | Select a filename, and, optionally, a directory. |
Weibull (EV 2)
Generate a data set according to a Weibull (EV 2) distribution.
Options:
alpha | shape | negative real |
mu | location | real |
sigma | scale | positive real |
Filename | Select a filename, and, optionally, a directory. |
EV
Generate a data set according to an EV distribution.
Options:
gamma | shape parameter | real |
mu | location parameter | real |
sigma | scale parameter | positive real |
Filename | Select a filename, and, optionally, a directory. |
Exponential (GP 0)
Generate a data set according to an exponential distribution.
Options:
mu | location parameter | real |
sigma | scale parameter | positive real |
Filename | Select a filename, and, optionally, a directory. |
Pareto (GP 1)
Generate a data set according to
a Pareto distribution.
Options:
alpha | shape parameter | positive real |
mu | location parameter | real |
sigma | scale parameter | positive real |
Filename | Select a filename, and, optionally, a directory. |
Beta (GP 2)
Generate a data set according to a Beta (GP 2) distribution.
Options:
alpha | shape parameter | negative real |
mu | location parameter | real |
sigma | scale parameter | positive real |
Filename | Select a filename, and, optionally, a directory. |
GP
Generate a data set according to a GP distribution.
Options:
gamma | shape parameter | real |
mu | location parameter | real |
sigma | scale parameter | positive real |
Filename | Select a filename, and, optionally, a directory. |
Animation for Discrete Data
Two windows open for plotting
Generate data by clicking on
The scatterplot and the sample histogram for the generated
data are displayed. The data set is saved to a file and
becomes the active one as soon as the
specified Samplesize is attained.
In addition, a dialog box opens with +1 and +20 buttons by
which one can also increase the number of generated data until
the selected Samplesize is attained.
Visualizing Continuous Data
Data are generated interactively. The underlying df is displayed in a
graphics window. Generate data by
The sample df for the given data is displayed. The data set is saved to
a file and becomes active as soon as the size specified in the dialog
box is attained.
Generate Bivariate Data
This submenu generates data from the following bivariate distributions:
Bivariate EV models: | Gumbel-McFadden |
Marshall-Olkin | |
Huesler-Reiss |
It is only available in the
multivariate mode within the
MAX domain.
Gumbel-McFadden
Generate a bivariate data set according to the Gumbel-McFadden
distribution with univariate Weibull (EV 2) margins with shape
parameter alpha = -1, i.e. exponential distributions on the
negative half line.
Options:
mu1 | location parameter | real |
sigma1 | scale parameter | positive real |
mu2 | location parameter | real |
sigma2 | scale parameter | positive real |
lambda | dependence parameter | greater than or equal to 1 |
Sample Size | positive integer | |
Filename | Select a filename, and, optionally, a directory. |
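Marshall-Olkin
Generate a bivariate data set according to the Marshall-Olkin distribution.
Options:
mu1 | location parameter | real |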
sigma1 | scale parameter | positive real |
mu2 | location parameter | real |
sigma2 | scale parameter | positive real |
lambda | dependence parameter | [ 0 , 1 ] |
Sample Size | positive integer | |
Filename | Select a filename, and, optionally, a directory. |
Huesler-Reiss
Generate a bivariate data set according to the Huesler-Reiss distribution.
Options:
mu1 | location parameter | real |
sigma1 | scale parameter | positive real |
mu2 | location parameter | real |
sigma2 | scale parameter | positive real |
lambda | correlation coefficient | [ -1 , 1 ] |
Sample Size | positive integer | |
Filename | Select a filename, and, optionally, a directory. |
Generate Multivariate Data
This option is only available in the
multivariate mode
within the SUM domain.
One can generate bi- and trivariate Gaussian
samples.
Bivariate Gaussian
Generate a bivariate Gaussian-distributed data set.
Options:
mu1 | location parameter | real |
sigma1 | scale parameter | positive real |
mu2 | location parameter | real |
sigma2 | scale parameter | positive real |
rho | correlation coefficient | [ -1 , 1 ] |
Sample Size | positive integer | |
Filename | Select a filename, and, optionally, a directory. |
Trivariate Gaussian
Generate a trivariate Gaussian-distributed data set.
Options:
Covariances | positive real | |
Location | location parameters | real |
Sample Size | positive integer | |
Filename | Select a filename, and, optionally, a directory. |
Bivariate Student
Trivariate Student
Generate Time Series
The following options are provided to generate time series data.
Gaussian AR(1)
Generate a data set according to a Gaussian AR(1) process.
Options:
mu | location parameter | real |
sigma | scale parameter | positive real |
d | correlation coefficient | [ 0 , 1 ] |
Sample Size | positive integer | |
Filename | Select a filename, and, optionally, a directory. |
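The following is a minimal Python sketch of a stationary Gaussian AR(1) simulation, not the Xtremes implementation. It assumes that sigma is the standard deviation of the stationary distribution and that d is the lag-one correlation coefficient, so the innovations have standard deviation sigma * sqrt(1 - d^2). The help text does not state which parametrization Xtremes uses; the function name `gaussian_ar1` is hypothetical.

```python
import math
import random

def gaussian_ar1(mu, sigma, d, n, rng=None):
    """Simulate n values of X[t] = mu + d * (X[t-1] - mu) + eps[t].

    Assumption: sigma is the stationary standard deviation, so the
    Gaussian innovations eps[t] have sd sigma * sqrt(1 - d^2).
    """
    if not 0.0 <= d <= 1.0:
        raise ValueError("d must lie in [0, 1]")
    rng = rng or random.Random()
    innovation_sd = sigma * math.sqrt(1.0 - d * d)
    x = rng.gauss(mu, sigma)  # start in the stationary distribution
    series = [x]
    for _ in range(n - 1):
        x = mu + d * (x - mu) + rng.gauss(0.0, innovation_sd)
        series.append(x)
    return series

sample = gaussian_ar1(mu=0.0, sigma=1.0, d=0.8, n=200, rng=random.Random(1))
print(len(sample))  # 200
```

With d = 0 the values are independent; the closer d is to 1, the more slowly the series moves away from its previous value.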
Moving Average MA(q)
Generate a data set according to a Moving Average MA(q) process.
Options:
The stored data set is now the active one.
Sample Size | positive integer |
Filename | Select a filename, and, optionally, a directory. |
ARMA(p, q) Process
Generate a data set according to a Gaussian ARMA(p,q) process.
The simulation makes use of the innovation algorithm.
Options:
The stored data set is now the active one.
If the AR polynomial has no zeros on the unit circle,
then there is a causal ARMA process
(which has a representation as a moving average).
Sample Size | positive integer |
Filename | Select a filename, and, optionally, a directory. |
Generate Counting/Point Process
Let 0 <= T[1] <= T[2] <= T[3] <= ... denote the
arrival times of data X[1], X[2], X[3], ... . Up
to a time horizon T such arrival times are generated and stored
to a file as Xtremes Univariate Data.
In addition, the path of the pertaining
counting process N(t), t >= 0, representing the number of data
occurring up to time t, can be plotted by using the Visualizing
button. Alternatively, adopt
the option Path in the Visualize menu.
In the bivariate case the marks X[1], X[2], X[3], ... are
added resulting in
points (T[1],X[1]), (T[2],X[2]), (T[3],X[3]), ....
For further details, see Statistical Analysis, pages
197 - 200.
Poisson Process
Recall the general remarks about arrival times
in Generate Counting/Point Process.
The first arrival process, and the simplest case, is
the homogeneous Poisson process with intensity lambda
on the positive half-line.
The interarrival times Y[i] = T[i] - T[i-1] are iid
exponential random variables with expectation 1/lambda.
The numbers N(t) are Poisson distributed with
parameter lambda t.
Parameters are:
lambda | intensity | positive real |
T | time horizon | nonnegative real |
Filename | filename |
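The construction via iid exponential interarrival times can be sketched in Python as follows. This is an illustration under the definitions above, not the Xtremes code; the function name `poisson_arrival_times` and the argument names are hypothetical.

```python
import random

def poisson_arrival_times(lam, horizon, rng=None):
    """Arrival times of a homogeneous Poisson process up to the time horizon.

    The interarrival times are iid exponential with expectation 1/lam.
    """
    if lam <= 0:
        raise ValueError("lam must be positive")
    rng = rng or random.Random()
    times = []
    t = rng.expovariate(lam)  # first arrival time T[1]
    while t <= horizon:
        times.append(t)
        t += rng.expovariate(lam)  # add the next interarrival time Y[i]
    return times

arrivals = poisson_arrival_times(lam=2.0, horizon=10.0, rng=random.Random(1))
# len(arrivals) is one realization of N(T), which is Poisson
# distributed with parameter lam * T = 20.
print(len(arrivals))
```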
Polya-Lundberg Process
Recall the general remarks about arrival times
in Generate Counting/Point Process.
The second arrival process is
the Polya-Lundberg process, a mixed Poisson process whose
parameter lambda is drawn according to a gamma density with shape
parameter alpha and scale parameter sigma. Given the drawn value of
lambda, the arrival times are generated according to the Poisson
process with intensity lambda.
alpha | shape | positive real |
sigma | scale | positive real |
Filename | filename |
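A minimal Python sketch of this two-stage construction follows: first draw lambda from a gamma distribution, then generate Poisson arrival times with that intensity. The time horizon argument is an assumption carried over from the Poisson Process option, because the parameter table above does not list one; the function name is hypothetical.

```python
import random

def polya_lundberg_arrival_times(alpha, sigma, horizon, rng=None):
    """Mixed Poisson process: gamma-distributed intensity, then Poisson arrivals.

    Returns the drawn intensity together with the arrival times up to the
    horizon (the horizon is an assumed argument, see the lead-in).
    """
    rng = rng or random.Random()
    lam = rng.gammavariate(alpha, sigma)  # shape alpha, scale sigma
    times = []
    t = rng.expovariate(lam)
    while t <= horizon:
        times.append(t)
        t += rng.expovariate(lam)
    return lam, times

lam, arrivals = polya_lundberg_arrival_times(2.0, 1.5, 10.0, random.Random(1))
print(lam, len(arrivals))
```

Because lambda varies from path to path, the counts N(t) are more dispersed than those of a Poisson process with a fixed intensity.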
Marked Poisson Process
Generate a time series with arrival times according to a
homogeneous Poisson process with intensity lambda (also
see Poisson Process) and
marks according to a generalized
Pareto (GP) distribution.
Transform Data
Data sets can be transformed by means of several predefined
operations. The resulting data set is of the same data type
as the original one. Choose the
option Convert to to convert a data set
to a different type.
Change Sign
The signs of the data of the active univariate or multivariate data set
are changed.
Enter a filename to store the transformed values in a separate file.
This option may also be applied to a time series. In this case, the second
component is transformed; the first one remains unchanged.
Affine Transformation
An affine transformation is applied to the active univariate or
multivariate data set, i.e. the transformation
f(x) = mu + sigma x
is done for each point of the data set. The transformed values
are written to a file.
This option may also be applied to a time series. In this case, the second
component is transformed; the first one remains unchanged.
Save Exceedances
This option is applicable for Xtremes Univariate Data, Xtremes Time
Series and Xtremes Multivariate Data. The values exceeding the
specified threshold are written to a new data set. In the multivariate case,
recall that a multivariate sample has a matrix form. One must
select one component (column) of the active data set. The lines
containing exceedances over the specified threshold in the selected
column are written to a new multivariate data set.
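As an illustration, the univariate and the multivariate case can be sketched in Python as below. The strict comparison (a value equal to the threshold is not an exceedance) is an assumption, and both function names are hypothetical.

```python
def exceedances(values, threshold):
    """Values of a univariate sample that exceed the threshold."""
    return [x for x in values if x > threshold]

def exceedance_rows(rows, column, threshold):
    """Rows of a multivariate sample whose selected column exceeds the threshold."""
    return [row for row in rows if row[column] > threshold]

print(exceedances([0.5, 2.0, 1.2, 3.1], 1.0))  # [2.0, 1.2, 3.1]

sample = [(1.0, 10.0), (2.5, 20.0), (0.3, 30.0)]
print(exceedance_rows(sample, column=0, threshold=0.9))
# [(1.0, 10.0), (2.5, 20.0)]
```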
Save Blocks Maxima
Given a univariate data set x[1], ..., x[n] of size n, Xtremes
builds blocks x[1], ..., x[k]; x[k+1], ..., x[2k]; ... ; x[lk], ...,
x[n] with k denoting the Block size and l = [n/k] the number
of blocks. The maximum of each block is saved to a file. Enter
block size and filename in the pertaining edit fields. The
transformed data set is stored under the selected name.
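The blocking scheme can be sketched in Python as follows. How Xtremes treats a final block with fewer than k values is not fully clear from the description above, so the sketch exposes this as a flag; the function name and the flag `keep_incomplete` are hypothetical.

```python
def block_maxima(values, k, keep_incomplete=False):
    """Maxima of consecutive, non-overlapping blocks of size k.

    By default only the n // k complete blocks are used; with
    keep_incomplete=True the maximum of the remaining values is appended.
    """
    if k < 1:
        raise ValueError("block size must be a positive integer")
    n = len(values)
    full = n // k  # number of complete blocks
    maxima = [max(values[i * k:(i + 1) * k]) for i in range(full)]
    if keep_incomplete and n % k:
        maxima.append(max(values[full * k:]))
    return maxima

data = [3, 1, 4, 1, 5, 9, 2, 6]
print(block_maxima(data, 3))                        # [4, 9]
print(block_maxima(data, 3, keep_incomplete=True))  # [4, 9, 6]
```

Replacing `max` by `sum` gives the corresponding block sums.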
Save Moving Maxima
Given a univariate data set x[1], ..., x[n], Xtremes builds
moving blocks x[1], ..., x[k]; x[2], ..., x[k+1], .... If the sample
size n is exceeded, blocks will be filled with values x[1], x[2],
... again. The maxima of all moving blocks are calculated and written
to a new data set. Enter block size and filename in the
pertaining edit fields. The
transformed data set is stored under the selected name.
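A short Python sketch of moving blocks with wrap-around is given below. It assumes that one block starts at every index, so that n blocks result; the help text does not state the number of blocks explicitly. The function name `moving_block_stat` is hypothetical.

```python
def moving_block_stat(values, k, stat=max):
    """Apply stat to the moving blocks x[i], ..., x[i+k-1].

    Indices beyond the sample size wrap around to the start of the sample.
    Assumption: one block per starting index, i.e. n blocks in total.
    """
    n = len(values)
    if k < 1 or n == 0:
        raise ValueError("need a positive block size and a non-empty sample")
    return [stat([values[(i + j) % n] for j in range(k)]) for i in range(n)]

data = [3, 1, 4, 1, 5]
print(moving_block_stat(data, 2))            # [3, 4, 4, 5, 5]
print(moving_block_stat(data, 2, stat=sum))  # [4, 5, 5, 6, 8]
```

The last block in the example consists of x[5] and, after wrapping, x[1].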
Save Blocks Sums
Given a univariate data set x[1], ..., x[n], Xtremes builds blocks
x[1], ..., x[k], x[k+1], ..., x[2k], ..., x[lk], ..., x[n] with k
denoting the Block size and l the number of blocks. The sum of
each block is saved to a file. Enter block size and
filename in the pertaining edit fields. The
transformed data set is stored under the selected name.
Save Moving Sums
Given a univariate data set x[1], ..., x[n], Xtremes builds
moving blocks x[1], ..., x[k]; x[2], ..., x[k+1], .... If the sample
size n is exceeded, blocks will be filled with values x[1], x[2],
... again. The sums of all moving blocks are calculated and written
to a new data set. Enter block size and filename in the
pertaining edit fields. The
transformed data set is stored under the selected name.
Save Cluster Maxima
Given a data set of type Xtremes Time Series x[1], ..., x[n], consider
the exceedances x[i(1)], ..., x[i(k)] over
a predetermined threshold u. The values i(j) are addressed as
exceedance times. Clusters of
exceedance times are built in the following way: Fix some positive
integer r. Any run of at least r
consecutive observations x[i] below the threshold u separates two
clusters. In other words:
between two consecutive clusters of exceedance times there is a minimal
gap of length r.
Xtremes calculates the maxima of all clusters and writes them to a new
data set.
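The run-based clustering rule can be sketched in Python as follows. The sketch works on the values x[i] only and assumes that "exceeding" means strictly greater than u, so values equal to u count as below the threshold; the function name is hypothetical.

```python
def cluster_maxima(values, u, r):
    """Maxima of clusters of exceedances over the threshold u.

    A run of at least r consecutive values not exceeding u closes the
    current cluster; the next exceedance opens a new one.
    """
    maxima = []
    current = None  # maximum of the cluster that is currently open
    gap = 0         # consecutive non-exceedances since the last exceedance
    for x in values:
        if x > u:
            if current is not None and gap >= r:
                maxima.append(current)  # gap long enough: close the cluster
                current = x
            elif current is None:
                current = x             # first exceedance opens a cluster
            else:
                current = max(current, x)
            gap = 0
        else:
            gap += 1
    if current is not None:
        maxima.append(current)
    return maxima

series = [1, 5, 6, 1, 7, 1, 1, 8, 1]
print(cluster_maxima(series, u=4, r=2))  # [7, 8]
print(cluster_maxima(series, u=4, r=1))  # [6, 7, 8]
```

With r = 2, the single value below the threshold between 6 and 7 does not separate the clusters, whereas the run of two values before 8 does.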
Dialog options:
Order Data
The active univariate data set is sorted. The sorted
values are written to a file. In case of Xtremes Multivariate Data, one
must select a component of the active data set first. The values in the
selected column are sorted in an ascending order, leaving the line
intact, that means, if the position of a specific value must be changed,
the whole line containing it will be moved (recall that multivariate
samples have a matrix structure).
Cumulate Data
The active data set is cumulated, i.e. the k-th value of the cumulated
data set contains the sum of the first k values of the original one.
Symmetrize
The active data set is symmetrized around zero or the median, i.e. given
a sample x[1], ..., x[n], Xtremes generates a data set x[1],
-x[1]+m, ..., x[n], -x[n]+m with either m = 0 or m = F**(-1)(1/2) (with
F denoting the underlying df).
Select Columns
Generate a new multivariate data set by selecting single columns of the
active one. The components of the active data set are displayed on the
left-hand side of the dialog box, those of the new one on the right
side.
They can be moved from one side to the other using the arrow
buttons. The new multivariate data set is composed from the selected
columns. This option can also be utilized to rearrange the components
of the active data set.
Dialog option:
Date Transformation
This operation requires the date (format:
day-month-year) in the first three columns of the data set. If
necessary, apply
the Select Columns option.
The transformation works as follows: the days are enumerated
in an ascending order, where missing days are treated as described
subsequently:
if, for instance, Monday 24th is addressed as 0, then Tuesday 25th will
be 1 and Thursday 27th will be 3 while Wednesday is missing.
The transformed data set is of type Xtremes Multivariate Data and can be
converted to an Xtremes Time Series. In other words: for each day,
Xtremes calculates the difference (in days) to day zero.
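The day enumeration can be illustrated with Python's standard `datetime` module. The sketch assumes that the first row of the data set serves as day zero; the function name and the year in the example are hypothetical.

```python
from datetime import date

def days_since_day_zero(dmy_rows):
    """Difference in days between each (day, month, year) row and day zero.

    Assumption: the first row of the data set is taken as day zero.
    """
    dates = [date(year, month, day) for day, month, year in dmy_rows]
    day_zero = dates[0]
    return [(d - day_zero).days for d in dates]

# The 24th, 25th and 27th of a month; the 26th is missing.
rows = [(24, 3, 1997), (25, 3, 1997), (27, 3, 1997)]
print(days_since_day_zero(rows))  # [0, 1, 3]
```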
Relative Frequencies
Given data of type Xtremes Discrete Data, the
frequencies in the second component are replaced by the pertaining
relative frequencies. The data are of type Xtremes Multivariate
Data.
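As a small Python illustration, assuming the discrete data are given as pairs (value, frequency), each frequency is divided by the total. The function name is hypothetical.

```python
def relative_frequencies(pairs):
    """Replace the frequency in each (value, frequency) pair by its share."""
    total = sum(freq for _, freq in pairs)
    if total <= 0:
        raise ValueError("the frequencies must sum to a positive number")
    return [(value, freq / total) for value, freq in pairs]

print(relative_frequencies([(0, 2), (1, 5), (2, 3)]))
# [(0, 0.2), (1, 0.5), (2, 0.3)]
```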
Fill missing (only multivariate mode)
Missing values of a given multivariate data set are imputed according
to a procedure explained in Statistical Analysis on page
220: A value is randomly selected from the k nearest neighbors.
Convert to
Data sets can be converted to other types by means of several
predefined operations. Choose the
option Transform Data to perform
transformations within a data type.
Convert to Grouped Data
The active univariate data set is converted to a grouped data
set. Specify a partition and a filename for the generated
data set.
The edit fields from, to and step width may be
utilized to create a partition.
This option may be applied to Xtremes Multivariate Data and Xtremes Time
Series as well. In the multivariate case, one must first select two
components of the original data set. A new data set is generated, with
the first component containing the "cells", the second one the "data".
The new data set is of type Xtremes Grouped Data, while the data
themselves have not been changed. In case of Xtremes Time Series, the
sample remains unchanged, while only the data type is converted to
Xtremes Grouped Data.
Additional dialog options:
Censored | The censoring information is removed from the data set. |
Grouped | The data are distributed equally within the interval defined by the partition. |
Multivariate | The user is prompted for a component of the data set. |
Discrete | A data set with multiple points is generated. |
Time Series | The first component containing the time is removed. |
Convert to Discrete Data
Convert to Time Series
The active data set is converted to an Xtremes Time Series and written
to a text file. Time series
data are given by pairs ( i, x[i] ), i = 1, ..., n, of discrete
times i and reals x[i]. This option can be applied to univariate and
multivariate data sets.
Dialog options:
If the active data set is of type Xtremes Multivariate Data, the dialog
box provides another option:
Convert to Censored Data
The primary interest concerns data x[1], ..., x[n], yet the
pairs (z[1], d[1]), ..., (z[n], d[n]) are merely observed.
We have z[i] = min (x[i], y[i]), where y[i] are the censoring
values. Moreover, d[i] indicates whether censoring has taken place
or not. We have
d[i] = 1, if x[i] is not censored, and
d[i] = 0, if x[i] is censored by y[i].
Let x[1], ..., x[n] be the active data of type Xtremes Univariate
Data. Fix a censoring distribution in the dialog box.
Xtremes will generate the censoring values y[1], ..., y[n] under this
distribution and compute d[i] and z[i]. The data
set (z[1], d[1]), ..., (z[n],d[n]) is of type
Xtremes Censored Data.
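The censoring step can be sketched in Python as follows. The exponential censoring distribution is only an example of a choice one might make in the dialog box, and treating a tie x[i] = y[i] as uncensored is an assumption; both function names are hypothetical.

```python
import random

def censor(xs, ys):
    """Pairs (z[i], d[i]) with z[i] = min(x[i], y[i]).

    d[i] = 1 if x[i] is not censored, d[i] = 0 if x[i] is censored by y[i].
    Assumption: a tie x[i] == y[i] counts as not censored.
    """
    return [(min(x, y), 1 if x <= y else 0) for x, y in zip(xs, ys)]

def censor_with_exponential(xs, mean, rng=None):
    """Draw censoring values from an exponential distribution (example choice)."""
    rng = rng or random.Random()
    ys = [rng.expovariate(1.0 / mean) for _ in xs]
    return censor(xs, ys)

print(censor([1.0, 3.0, 2.5], [2.0, 1.5, 4.0]))
# [(1.0, 1), (1.5, 0), (2.5, 1)]
```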
Convert to Multivariate Data
An arbitrary number of data sets can be combined to a multivariate
data set. Mark those you want to combine in the list box
showing all loaded data sets.
Zeros will be added at the end of a column if data sets of
different sizes are combined. Grouped and discrete data
sets are not processed any further; Xtremes treats them
like bivariate data sets when this option is applied.
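The zero padding of shorter columns can be illustrated by the following Python sketch for univariate data sets. The function name is hypothetical.

```python
def combine_columns(*datasets):
    """Combine univariate data sets into rows of a multivariate data set.

    Shorter data sets are padded with zeros at the end of their column.
    """
    if not datasets:
        return []
    n = max(len(d) for d in datasets)
    columns = [list(d) + [0.0] * (n - len(d)) for d in datasets]
    return [tuple(row) for row in zip(*columns)]

print(combine_columns([1.0, 2.0, 3.0], [4.0]))
# [(1.0, 4.0), (2.0, 0.0), (3.0, 0.0)]
```

Note that the padded zeros cannot be distinguished from observed zeros in the combined data set.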
Dialog options
Choose Data
The Choose Data dialog allows the user to choose
a new active data set
from those already loaded or generated.
One can also delete data sets from the memory. Such a deletion
will not affect the associated files on your disk. All curves
and estimators based on the data set will be deleted
automatically.
This option is also available via a right-click within the
Xtremes main window.
List Data
The active data set is displayed in a text window.