## Improved data handling

Load, transform and analyze in one line. Simple-to-use formula string notation supports:

- NEW formula strings
- Read SAS and STATA datasets
- Improved Handling of Large Datasets

### NEW formula strings simplify loading and transforming data

    // Load data, transform and estimate in one step
    call ols("credit.xlsx", "ln(balance) ~ ln(income) + factor(sex)");

    // Load data, reclassifying string column 'state' into numerical categories
    X = loadd("census.csv", "income + household_size + reclassify(state)");

GAUSS 19 expands the previously integrated formula string syntax to allow single-line data loading, transformation, and analysis. The enhanced formula string syntax:

- Drastically decreases the required lines of code.
- Works with **CSV**, **Excel**, **GAUSS**, **HDF5**, **SAS** and **STATA** datasets.
- Supports automated transformations that:
    - apply functions such as `ln`, `exp` or any GAUSS procedure
    - create interaction terms
    - create dummy variables
    - reclassify string variables
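To make the "one line" claim concrete, the sketch below spells out roughly what a formula string saves you from writing by hand. It reuses the `credit.xlsx` file and column names from the example above, and it assumes that `loadd` returns the columns in the order they are listed in the formula; treat it as an illustration rather than a reference implementation.

```
// Load two log-transformed columns from the workbook used above
data = loadd("credit.xlsx", "ln(balance) + ln(income)");

// Split the loaded matrix into dependent and independent variables
// (assumes columns come back in the order given in the formula)
y = data[., 1];
x = data[., 2];

// An empty dataset name tells ols to use the matrices directly
call ols("", y, x);
```

With the formula string form, the load, the split and the estimation collapse into a single `ols` call.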

### Read SAS and STATA datasets

    // Load SAS dataset, create interaction term and assign to GAUSS matrix 'X'
    X = loadd("advertising.sas7bdat", "sales + radio * billboards + direct_mail");

    // Load Stata dataset and compute descriptive statistics
    call dstatmt("auto2.dta", "mpg + weight + gear_ratio");

Sharing data between GAUSS and other software just got easier. GAUSS 19 not only reads SAS and STATA datasets, it also lets you pass them directly as the data source for functions such as OLS, GLM and GMM.

- Complete compatibility with SAS and STATA datasets.
- Import SAS and STATA datasets as data matrices.
- Use SAS and STATA datasets as direct arguments in functions such as OLS, GLM and GMM.

### Improved Handling of Large Datasets

    // Load large data in consecutive 1 GB blocks
    setBlockSize("1G");

    // Load large data in consecutive blocks no larger than 10% of system memory
    setBlockSize("10%");

GAUSS 19 makes it easy to access and analyze large datasets with new, simple tools for controlling data processing. Users can now control data processing in terms of:

- Percentage of available memory.
- Simple memory specification.
- Number of rows.

## Generalized Method of Moments (GMM)

New GAUSS 19 GMM procedures estimate parameters from custom **user-specified moment equations** or compute analytic **instrumental variables** estimates using **one-step**, **two-step**, **iterative**, or **continuously updating** GMM.

Choose from two new GMM procedures!

- **gmmFit** provides full modeling flexibility, including user-specified moment equations.
- **gmmFitIV** provides the analytic generalized method of moments estimates of instrumental variables models.

GAUSS GMM procedures provide new robust, efficient and customizable tools including:

- **One-step**, **two-step**, **iterative**, and **continuously updating** generalized method of moments estimation.
- Optional instrumental variables.
- Standard error and weight matrix options including **standard**, **heteroskedastic robust**, and **HAC robust**.
- Flexible initial weight matrix specification.
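For readers new to the terminology, the estimation variants listed above differ mainly in how the weight matrix is chosen. The following is a brief textbook sketch; the notation is introduced here for illustration and is not taken from the GAUSS documentation.

$$
\bar{g}_n(\theta) = \frac{1}{n} \sum_{i=1}^{n} g(z_i, \theta),
\qquad
\hat{\theta} = \arg\min_{\theta} \; \bar{g}_n(\theta)^{\top} \, W \, \bar{g}_n(\theta)
$$

Here $g$ is the vector of moment functions, $z_i$ is the data for observation $i$, and $W$ is a positive definite weight matrix. One-step estimation keeps the initial $W$ fixed. Two-step estimation re-estimates once with $W = \hat{S}(\hat{\theta}_1)^{-1}$, where $\hat{S}$ is the estimated covariance of the moments. Iterative estimation repeats that update until the estimates settle, and continuously updating estimation lets $W = \hat{S}(\theta)^{-1}$ vary with $\theta$ inside the minimization.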

## New Graphics Functionality

- New Color Palette Options and Defaults
- Control graph canvas size programmatically
- Other new features

Convey insights from your data with modern, professional images.

### New Color Palette Options and Defaults

GAUSS 19 color palette management tools allow you to apply professional color schemes to your graphics:

- Choose from over 30 built-in color palettes designed for optimal visual impact.
    - Premade palettes available for sequential, qualitative and diverging data.
    - Colorblind-friendly palettes available.
- Create your own palettes.
    - Create sets of evenly spaced colors in HSL hue space.
    - Create sets of evenly spaced circular hues in the HSLuv system.
    - Blend colors to create custom color palettes.

### Control graph canvas size programmatically

Easily produce and reproduce graphs that fit where you need them! The new GAUSS procedure `plotCanvasSize` adjusts plot canvas size programmatically based on specifications in centimeters, millimeters, inches or pixels.

    // Make this call before your plot to set the graph canvas
    // to 750 px by 530 px as shown in the image above
    plotCanvasSize("px", 750|530);

### Other new features

    string x_labels = { "Jan", "Feb", "Mar", "Apr", "May", "Jun" };

    // Hypothetical Y values, one per label, for illustration only
    y = { 12, 15, 11, 18, 21, 19 };

    plotXY(x_labels, y);

- Easily pass in string arrays as 'X' labels to any 2-D graph such as scatter, XY, and others.
- Add any 2-D graph type, such as scatter plots, to contour plots.
- New function **plotSetTicLabelFont** to control the font, font size and font color of the X and Y tic labels.

## Speed Improvements

- Chained concatenation operations 2-4x faster
- X'Y for vector-vector case 15%-600% faster
- Inverse incomplete gamma function up to 10x faster

Experience improved speed for a number of fundamental computations in GAUSS 19, such as chained concatenation, vector-vector multiplication, and descriptive statistics.

- Chained concatenation operations 2-4x faster.
- X'Y for vector-vector case 15%-600% faster for vectors larger than approximately 50 elements.
- Significant speed-up of small matrix indexing.
- Descriptive statistics with **dstatmt** and OLS with function **ols** 15-30% faster for medium to large matrix inputs.
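As a reminder of what these operations look like in code, here is a minimal sketch of chained concatenation and the vector-vector X'Y case. The vector length and variable names are hypothetical and chosen only for illustration; no timings are implied.

```
// Hypothetical size, chosen only for illustration
n = 1000;
a = rndn(n, 1);
b = rndn(n, 1);
c = rndn(n, 1);

// Chained horizontal concatenation: n x 3 matrix
X = a ~ b ~ c;

// Chained vertical concatenation: 3n x 1 vector
v = a | b | c;

// Vector-vector X'Y case: 1 x 1 inner product
ip = a'b;

// Show the dimensions of the concatenated results
print rows(X)~cols(X);
print rows(v)~cols(v);
```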

### Example nonlinear functions

To make the impact of the concatenation and indexing speed-ups more concrete, the code for two of the nonlinear functions used to create the graph is shown below.

    proc fct_a(x);
        local f1, f2, f3;

        f1 = 3*x[1]^3 + 2*x[2]^2 + 5*x[3] - 10;
        f2 = -x[1]^3 - 3*x[2]^2 + x[3] + 5;
        f3 = 3*x[1]^3 + 2*x[2]^2 - 4*x[3];

        retp(f1|f2|f3);
    endp;

    proc fct_b(x);
        local ff1, ff2, ff3, ff4, ff5, ff6, ff7, P;

        P = 20;

        ff1 = 0.5*x[1] + x[2] + 0.5*x[3] - x[6]/x[7];
        ff2 = x[3] + x[4] + 2*x[5] - 2/x[7];
        ff3 = x[1] + x[2] + x[5] - 1/x[7];
        ff4 = -28837*x[1] - 139009*x[2] - 78213*x[3] + 18927*x[4]
              + 8427*x[5] + 13492/x[7] - 10690*x[6]/x[7];
        ff5 = x[1] + x[2] + x[3] + x[4] + x[5] - 1;
        ff6 = (P^2)*x[1]*x[4]^3 - 1.7837e5*x[3]*x[5];
        ff7 = x[1]*x[3] - 2.6058*x[2]*x[4];

        retp(ff1|ff2|ff3|ff4|ff5|ff6|ff7);
    endp;

## New mathematical functions

- besselk computes the modified Bessel function of the second kind; useful for the normal inverse Gaussian distribution.
- rndRayleigh generates Rayleigh-distributed random numbers.
- gmmFit and gmmFitIV for estimation using the generalized method of moments.
- cdfTruncNorm, pdfTruncNorm, cdfLogNorm and pdfLogNorm.
- Optional mu and sigma inputs for cdfn and pdfn.
- Array support for erf, erfc, erfcinv, pdfn and the power operator.
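The optional mu and sigma inputs for cdfn and pdfn replace a standardization step that previously had to be written by hand. The sketch below shows the manual approach, which uses only the long-standing one-argument forms, next to the shorter call. The argument order `(x, mu, sigma)` is an assumption made for illustration, so check the command reference before relying on it; the input values are hypothetical.

```
// Hypothetical inputs, for illustration only
x = { -1, 0, 1.5 };
mu = 0.5;
sigma = 2;

// Manual approach: standardize, then use the standard normal functions
p_manual = cdfn((x - mu) ./ sigma);
d_manual = pdfn((x - mu) ./ sigma) ./ sigma;

// With the optional inputs (argument order assumed, not confirmed)
p_direct = cdfn(x, mu, sigma);
d_direct = pdfn(x, mu, sigma);

// The two columns in each printout should agree if the assumed order is right
print p_manual~p_direct;
print d_manual~d_direct;
```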