What is Lisp-Stat?
Lisp-Stat is a domain-specific language (DSL) for statistical analysis and machine learning. It is targeted at statistics practitioners with little or no experience in programming.
Relationship to XLISP-Stat
Although inspired by Tierney’s XLISP-Stat, Lisp-Stat is a reboot in Common Lisp. XLISP-Stat code is unlikely to run except in trivial cases. Existing XLISP-Stat libraries can be ported with the assistance of the XLS-Compat system.
Core Systems
Lisp-Stat is composed of several systems (projects), each independently useful and brought together under the Lisp-Stat umbrella. Dependencies between systems have been minimised to the extent possible so you can use them individually without importing all of Lisp-Stat.
Data-Frame
A data frame is a data structure conceptually similar to an R data frame. It provides column-centric storage for data sets where each named column contains the values for one variable, and each row contains one set of observations. For data frames, we use the ‘tibble’ from the tidyverse as inspiration for functionality.
Data frames can contain values of any type. If desired, additional attributes, such as the numerical type, unit and other information, may be attached to the variable for convenience or efficiency. For example, you could specify a unit of length, say m (meters), to ensure that mathematical operations on that variable always produce lengths (though the unit may change).
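To make the column-centric idea concrete, here is a minimal sketch in plain ANSI Common Lisp. It is not the Data-Frame API: the names *frame*, column and row are hypothetical, the values are made up, and the association-list representation is an assumption chosen only for illustration.

```lisp
;; Illustrative only: plain ANSI Common Lisp, not the Data-Frame API.
;; *FRAME*, COLUMN and ROW are hypothetical names; the values are made up.
(defparameter *frame*
  (list (cons :id     (vector 1 2 3))
        (cons :label  (vector "a" "b" "c"))
        (cons :weight (vector 10 20 30))))

(defun column (frame name)
  "Return the vector stored under NAME, or NIL if there is no such column."
  (cdr (assoc name frame)))

(defun row (frame index)
  "Collect the INDEX-th value of every column, in column order."
  (mapcar (lambda (entry) (aref (cdr entry) index)) frame))

(column *frame* :weight) ; => #(10 20 30)
(row *frame* 1)          ; => (2 "b" 20)
```

Each variable lives in its own vector, so reading a whole column is a single lookup, while a row has to be assembled from one element of every column.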
DFIO
The Data Frame I/O system provides input and output operations for data frames. A data frame may be written to and read from files, strings or streams, including network streams or relational databases.
Select
Select is a facility for selecting portions of sequences or arrays. It provides:
- An API for making selections (elements selected by the Cartesian product of vectors of subscripts for each axis) of array-like objects. The most important function is select. Unless you want to define additional methods for select, this is pretty much all you need from this library.
- An extensible DSL for selecting a subset of valid subscripts. This is useful if, for example, you want to resolve column names in a data frame in your implementation of select, or implement filtering based on row values.
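The phrase "Cartesian product of vectors of subscripts" can be illustrated with a small sketch in plain ANSI Common Lisp. The helper select-cartesian below is hypothetical and handles only two-dimensional arrays; it is not the Select library's select function, only a picture of the idea.

```lisp
;; Plain ANSI Common Lisp; SELECT-CARTESIAN is a hypothetical helper,
;; not the Select library's API.
(defun select-cartesian (matrix rows cols)
  "Return a new 2D array holding MATRIX[r, c] for every r in ROWS and c in COLS."
  (let ((result (make-array (list (length rows) (length cols)))))
    (loop for r in rows
          for i from 0
          do (loop for c in cols
                   for j from 0
                   do (setf (aref result i j) (aref matrix r c))))
    result))

(defparameter *m*
  (make-array '(3 3) :initial-contents '((0 1 2) (3 4 5) (6 7 8))))

;; Rows 0 and 2 crossed with columns 1 and 2.
(select-cartesian *m* '(0 2) '(1 2)) ; => #2A((1 2) (7 8))
```

Every selected row subscript is paired with every selected column subscript, which is why two subscripts per axis yield a 2 by 2 result.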
Array Operations
This library is a collection of functions and macros for manipulating Common Lisp arrays and performing numerical calculations with them. The library provides shorthand codes for frequently used operations, displaced array functions, indexing, transformations, generation, permutation and reduction of columns. Array operations may also be applied to data frames, and data frames may be converted to/from arrays.
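Displaced arrays are a standard Common Lisp feature that the displaced array functions mentioned above build on. The sketch below uses only make-array from ANSI Common Lisp; row-view is a hypothetical helper and not a function from the Array Operations library.

```lisp
;; Standard Common Lisp only; ROW-VIEW is a hypothetical helper.
(defparameter *a*
  (make-array '(2 3) :initial-contents '((1 2 3) (4 5 6))))

;; A flat, row-major view of *A*; no elements are copied.
(defparameter *flat*
  (make-array (array-total-size *a*) :displaced-to *a*))

(defun row-view (matrix i)
  "Return a vector that shares storage with row I of the 2D array MATRIX."
  (let ((ncols (array-dimension matrix 1)))
    (make-array ncols
                :displaced-to matrix
                :displaced-index-offset (* i ncols))))

(reduce #'+ *flat*)               ; => 21
(setf (aref (row-view *a* 1) 0) 40)
(aref *a* 1 0)                    ; => 40, the write went through the view
```

Because a displaced array shares storage with its target, writes through the view are visible in the original array.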
Special Functions
This library implements numerical special functions in Common Lisp with a focus on high-accuracy double-float calculations. These functions, e.g. gamma and beta, are the basis for the statistical distribution functions.
Cephes
Cephes.cl is a CFFI wrapper over the Cephes Math Library, a high-quality C implementation of statistical functions. We use this both as an accuracy check (Boost uses these to check its accuracy too) and to fill in the gaps where we don’t yet have Common Lisp implementations of these functions.
Numerical Utilities
Numerical Utilities is the base system that most others depend on. It is a collection of packages providing:
- num= and other comparison operators for floats
- simple arithmetic functions, like sum and l2norm
- element-wise operations for arrays and vectors
- intervals
- special matrices and shorthand for their input
- sample statistics
- Chebyshev polynomials
- quadratures
- univariate root finding
- Horner’s method, Simpson’s rule and other functions for numerical analysis
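Two of the items above, tolerant float comparison and Horner's method, can be sketched in a few lines of plain ANSI Common Lisp. The functions approx= and horner are hypothetical stand-ins written for this example; their names, argument order and default tolerance are assumptions and do not describe the Numerical Utilities API.

```lisp
;; Plain ANSI Common Lisp; APPROX= and HORNER are hypothetical helpers.
(defun approx= (a b &optional (tolerance 1d-9))
  "True when A and B differ by at most TOLERANCE, scaled by the larger
magnitude of the two (with a floor of 1)."
  (<= (abs (- a b))
      (* tolerance (max 1 (abs a) (abs b)))))

(defun horner (coefficients x)
  "Evaluate a polynomial at X; COEFFICIENTS run from the highest degree down."
  (reduce (lambda (acc c) (+ (* acc x) c))
          coefficients
          :initial-value 0))

(= (+ 0.1d0 0.2d0) 0.3d0)       ; => NIL, exact comparison fails
(approx= (+ 0.1d0 0.2d0) 0.3d0) ; => T
(horner '(2 -6 2 -1) 3)         ; 2x^3 - 6x^2 + 2x - 1 at x = 3 => 5
```

Exact equality on floats is fragile because of rounding, which is the motivation for tolerance-based comparison operators; Horner's method evaluates a degree-n polynomial with n multiplications and n additions.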
Lisp-Stat
This is the top-level system that uses the other packages to create a statistical computing environment. It is also the location for the ‘unified’ interface, where the holes are plugged with third-party packages. For example, cl-mathstats contains functionality not yet in Lisp-Stat; however, its architecture does not lend itself well to incorporation via an ASDF depends-on, so as we consolidate the libraries, missing functionality will be placed in the Lisp-Stat system. Eventually parts of numerical-utilities, especially the statistics functions, will be relocated here.
Acknowledgements
Tamas Papp was the original author of many of these libraries. Starting with relatively clean, working code that solves real-world problems was a great start to the development of Lisp-Stat.