Examples

Using Lisp-Stat in the real world

One of the best ways to learn Lisp-Stat is to see examples of actual work. This section contains examples of statistical analysis, derived from the book Introduction to the Practice of Statistics (2017) by Moore, McCabe and Craig, and of plotting, derived from the Vega-Lite example gallery.

1 - Notebooks

From the ninth edition of the book, Introduction to the Practice of Statistics

These notebooks describe how to undertake statistical analyses introduced as examples in the Ninth Edition of Introduction to the Practice of Statistics (2017) by Moore, McCabe and Craig. The notebooks are organised in the same manner as the chapters of the book. The data comes from the site IPS9 in R by Nicholas Horton.

The notebooks are implemented using a third-party library, common-lisp-jupyter, and are known to work with revision b1021ab.

Looking at data

Chapter 1 – Distributions: Exploratory data analysis using plots and numbers

2 - Plotting

Example plots

The plots here reproduce examples from the Vega-Lite example gallery.

Preliminaries

Load Vega-Lite

Load Vega-Lite and network libraries:

(ql:quickload :lisp-stat)
(ql:quickload :plot/vglt)
(ql:quickload :dexador)
(ql:quickload :access)

Load example data

(in-package :lisp-stat)
(defparameter vega-cars
  (vglt:vl-to-df
   (dex:get
    "https://raw.githubusercontent.com/vega/vega-datasets/master/data/cars.json"
    :want-stream t)))

Strip plot

The Vega-Lite strip plot example shows the relationship between horsepower and the number of cylinders using tick marks.

In this example we will show how to build a spec from beginning to end, without using a plot template.

JSON

{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "description": "Shows the relationship between horsepower and the number of cylinders using tick marks.",
  "data": {"url": "data/cars.json"},
  "mark": "tick",
  "encoding": {
    "x": {"field": "Horsepower", "type": "quantitative"},
    "y": {"field": "Cylinders", "type": "ordinal"}
  }
}

Lisp-Stat

(defparameter cars-strip-plot
  (line-up-first
    (vglt:spec)
    (vglt:add "description" "Shows the relationship between horsepower and the number of cylinders using tick marks.")
    (vglt:add "data" `(("values" . ,(vglt:df-to-alist vega-cars))))
    (vglt:add "mark" "tick")
    (vglt:add "encoding" '(("x" ("field" . "HORSEPOWER") ("type" . "quantitative") ("title" . "Horsepower"))
                           ("y" ("field" . "CYLINDERS")  ("type" . "ordinal") ("title" . "Cylinders"))))))
(plot:plot-from-file (vglt:save-plot 'cars-strip-plot))
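
The encoding above is a nested association list: each JSON object becomes an alist and each key/value pair becomes a dotted pair. The examples on this page write the same structure in two ways, with and without an explicit dot after the key. The following is a minimal sketch in plain Common Lisp, with no Lisp-Stat calls, showing that both notations read as the same list and how a value can be looked up. The names `*strip-encoding*` and `encoding-field` are hypothetical and exist only for this illustration.

```lisp
;; Hypothetical variable holding an encoding alist like the one above.
(defparameter *strip-encoding*
  '(("x" ("field" . "HORSEPOWER") ("type" . "quantitative"))
    ("y" ("field" . "CYLINDERS")  ("type" . "ordinal"))))

;; ("x" (...)) and ("x" . ((...))) are the same list to the Lisp reader,
;; so this comparison is true.
(equal '("x" ("field" . "HORSEPOWER"))
       '("x" . (("field" . "HORSEPOWER"))))

;; Hypothetical helper: the keys are strings, so ASSOC needs a
;; string-aware test instead of the default EQL.
(defun encoding-field (channel encoding)
  "Return the field name bound to CHANNEL in ENCODING, or NIL."
  (cdr (assoc "field"
              (cdr (assoc channel encoding :test #'string=))
              :test #'string=)))

(encoding-field "y" *strip-encoding*) ; "CYLINDERS"
```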

Scatter plots

Basic

A basic Vega-Lite scatterplot showing horsepower and miles per gallon for various cars.

Horsepower vs. MPG scatter plot

In this example we use the Lisp-Stat template for a basic scatter plot.

JSON

{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "description": "A scatterplot showing horsepower and miles per gallons for various cars.",
  "data": {"url": "data/cars.json"},
  "mark": "point",
  "encoding": {
    "x": {"field": "Horsepower", "type": "quantitative"},
    "y": {"field": "Miles_per_Gallon", "type": "quantitative"}
  }
}

Lisp-Stat

(defparameter cars-scatter-plot
  (vglt:scatter-plot vega-cars "HORSEPOWER" "MILES_PER_GALLON"))
(plot:plot-from-file (vglt:save-plot 'cars-scatter-plot))

Colored

In this example we’ll show how to modify a plot that was based on one of the Lisp-Stat plotting templates. We’d like to add some information to the cars scatter plot to show each car’s origin. The Vega-Lite example shows that we have to add two new directives, color and shape, to the encoding of the plot:

(pushnew
 '("color" . (("field" . "ORIGIN") ("type" . "nominal")))
 (access:accesses cars-scatter-plot :encoding))
(pushnew
 '("shape" . (("field" . "ORIGIN") ("type" . "nominal")))
 (access:accesses cars-scatter-plot :encoding))
(plot:plot-from-file (vglt:save-plot 'cars-scatter-plot))

With this change we can see that the higher-horsepower, lower-efficiency cars are from the USA, and the higher-efficiency cars are from Japan and Europe.
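
One thing to watch when working interactively: `pushnew` compares with `eql` by default, and a freshly read list is not `eql` to an existing entry, so evaluating the forms above a second time can add duplicate channels. The sketch below demonstrates the standard `:key` and `:test` arguments on a plain alist. The names `*scatter-encoding*` and `add-channel` are hypothetical, and whether the same keywords behave identically on an `access:accesses` place is an assumption that should be checked before relying on it.

```lisp
;; Hypothetical stand-in for a plot's encoding alist.
(defparameter *scatter-encoding*
  (list (cons "x" '(("field" . "HORSEPOWER") ("type" . "quantitative")))))

;; Hypothetical helper: add CHANNEL only if no entry with that name exists.
;; :KEY #'CAR compares channel names; :TEST #'STRING= compares them as strings.
(defun add-channel (channel definition)
  (pushnew (cons channel definition) *scatter-encoding*
           :key #'car :test #'string=)
  *scatter-encoding*)

;; Evaluating the same form twice leaves a single "color" entry.
(add-channel "color" '(("field" . "ORIGIN") ("type" . "nominal")))
(add-channel "color" '(("field" . "ORIGIN") ("type" . "nominal")))

(length *scatter-encoding*) ; 2
```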

Text marks

This plot shows the same information, with each car’s origin additionally indicated by a text mark. This Vega-Lite example is sufficiently different from the template that we’ll construct it all here. Notice the use of a data transformation.

JSON

{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "data": {"url": "data/cars.json"},
  "transform": [{
    "calculate": "datum.Origin[0]",
    "as": "OriginInitial"
  }],
  "mark": "text",
  "encoding": {
    "x": {"field": "Horsepower", "type": "quantitative"},
    "y": {"field": "Miles_per_Gallon", "type": "quantitative"},
    "color": {"field": "Origin", "type": "nominal"},
    "text": {"field": "OriginInitial", "type": "nominal"}
  }
}

Lisp-Stat

(defparameter cars-scatter-text-plot
  (line-up-first
    (vglt:spec)
    (vglt:add "data" `(("values" . ,(vglt:df-to-alist vega-cars))))
    (vglt:add "transform" #((("calculate" . "datum.ORIGIN[0]") ("as" . "OriginInitial"))))
    (vglt:add "mark" "text")
    (vglt:add "encoding" '(("x" ("field" . "HORSEPOWER") ("type" . "quantitative") ("title" . "Horsepower"))
                           ("y" ("field" . "MILES_PER_GALLON") ("type" . "quantitative") ("title" . "Miles per Gallon"))
                           ("color" . (("field" . "ORIGIN") ("type" . "nominal")))
                           ("text" . (("field" . "OriginInitial") ("type" . "nominal")))))))
(plot:plot-from-file (vglt:save-plot 'cars-scatter-text-plot))
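
The `calculate` transform is evaluated by Vega-Lite when the plot is rendered: the expression `datum.ORIGIN[0]` takes the first character of each origin. To make clear what that expression produces, here is the same computation as a small sketch in plain Common Lisp. The function name `origin-initial` is hypothetical; the three sample strings are the origin values found in the cars dataset.

```lisp
;; Hypothetical helper mirroring the Vega-Lite expression datum.ORIGIN[0]:
;; return the first character of ORIGIN as a one-character string.
(defun origin-initial (origin)
  (subseq origin 0 1))

(map 'vector #'origin-initial #("USA" "Europe" "Japan")) ; #("U" "E" "J")
```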

Interactive scatter plot matrix

This Vega-Lite example creates a scatter plot matrix (SPLOM) with interactive brushing, panning and zooming.

Above is a PNG file. The interactive version is here.

JSON

{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "repeat": {
    "row": ["Horsepower", "Acceleration", "Miles_per_Gallon"],
    "column": ["Miles_per_Gallon", "Acceleration", "Horsepower"]
  },
  "spec": {
    "data": {"url": "data/cars.json"},
    "mark": "point",
    "params": [
      {
        "name": "brush",
        "select": {
          "type": "interval",
          "resolve": "union",
          "on": "[mousedown[event.shiftKey], window:mouseup] > window:mousemove!",
          "translate": "[mousedown[event.shiftKey], window:mouseup] > window:mousemove!",
          "zoom": "wheel![event.shiftKey]"
        }
      },
      {
        "name": "grid",
        "select": {
          "type": "interval",
          "resolve": "global",
          "translate": "[mousedown[!event.shiftKey], window:mouseup] > window:mousemove!",
          "zoom": "wheel![!event.shiftKey]"
        },
        "bind": "scales"
      }
    ],
    "encoding": {
      "x": {"field": {"repeat": "column"}, "type": "quantitative"},
      "y": {
        "field": {"repeat": "row"},
        "type": "quantitative",
        "axis": {"minExtent": 30}
      },
      "color": {
        "condition": {
          "param": "brush",
          "field": "Origin",
          "type": "nominal"
        },
        "value": "grey"
      }
    }
  }
}

Lisp-Stat

(defparameter cars-interactive-splom
  (line-up-first
   (vglt:spec)
   (vglt:add "repeat" '(("row" . #("HORSEPOWER" "ACCELERATION" "MILES_PER_GALLON"))
                        ("column" . #("MILES_PER_GALLON" "ACCELERATION" "HORSEPOWER"))))
   (vglt:add "spec"
             `(("data" ("values" . ,(vglt:df-to-alist vega-cars)))
               ("mark" . "point")
               ("params" . #((("name" . "brush")
                              ("select"
                               ("type" . "interval")
                               ("resolve" . "union")
                               ("on" . "[mousedown[event.shiftKey], window:mouseup] > window:mousemove!")
                               ("translate" . "[mousedown[event.shiftKey], window:mouseup] > window:mousemove!")
                               ("zoom" . "wheel![event.shiftKey]")))
                             (("name" . "grid")
                              ("select"
                               ("type" . "interval")
                               ("resolve" . "global")
                               ("translate" . "[mousedown[!event.shiftKey], window:mouseup] > window:mousemove!")
                               ("zoom" . "wheel![!event.shiftKey]"))
                              ("bind" . "scales"))))
               ("encoding" . (("x" ("field" ("repeat" . "column")) ("type" . "quantitative"))
                              ("y" ("field" ("repeat" . "row")) ("type" . "quantitative") ("axis" ("minExtent" . 30)))
                              ("color" ("condition" ("param" . "brush")
                                                    ("field" . "ORIGIN")
                                                    ("type" . "nominal"))
                                       ("value" . "grey"))))))))
(plot:plot-from-file (vglt:save-plot 'cars-interactive-splom))
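
Comparing the two versions, the Lisp spec uses vectors (`#(...)`) wherever the JSON has an array and alists wherever the JSON has an object. The toy encoder below illustrates that convention only. It is not the serializer that the plotting library uses, handles just strings, numbers, vectors and alists, and the names `alist-p` and `toy-json` are hypothetical.

```lisp
;; Hypothetical predicate: a non-empty list of conses with string keys.
(defun alist-p (x)
  (and (consp x)
       (every (lambda (entry)
                (and (consp entry) (stringp (car entry))))
              x)))

;; Toy encoder for illustration: vectors become JSON arrays and
;; alists become JSON objects. STRINGP is tested before VECTORP
;; because strings are themselves vectors.
(defun toy-json (x)
  (cond ((stringp x) (format nil "~s" x))
        ((numberp x) (format nil "~a" x))
        ((vectorp x)
         (format nil "[~{~a~^,~}]" (map 'list #'toy-json x)))
        ((alist-p x)
         (format nil "{~{~a~^,~}}"
                 (mapcar (lambda (entry)
                           (format nil "~s:~a"
                                   (car entry)
                                   (toy-json (cdr entry))))
                         x)))
        (t (error "Unsupported value: ~s" x))))

(toy-json '(("row" . #("A" "B")) ("axis" ("minExtent" . 30))))
;; => "{\"row\":[\"A\",\"B\"],\"axis\":{\"minExtent\":30}}"
```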