Contribution Guidelines

How to contribute to Lisp-Stat

This section describes the mechanics of contributing to Lisp-Stat: legal requirements, community guidelines, the code of conduct, and so on. For details on how to contribute code and documentation, see the links under Contributing in the navigation sidebar.

For ideas about what you might contribute, please see the open issues on GitHub and the ideas page. The organisation repository contains the individual sub-projects. Contributions to documentation are especially welcome.

Contributor License Agreement

Contributor License Agreements (CLAs) are common and accepted in open source projects. We all wish for Lisp-Stat to be used and distributed as widely as possible, and for its users to be confident about the origins and continuing existence of the code. The CLA helps us achieve that goal. Although CLAs are common, many in the Lisp community are unaware of them or their importance. Some frequently asked questions include:

Why do you need a CLA?

We need a CLA because, by law, all rights reside with the originator of a work unless otherwise agreed. The CLA allows the project to accept and distribute your contributions. Without your consent via a CLA, the project has no rights to use the code. Here’s what Google has to say in their CLA policy page:

Standard inbound license

Using one standard inbound license that grants the receiving company broad permission to use contributed code in products is beneficial to the company and downstream users alike.

Technology companies will naturally want to make productive use of any code made available to them. However, if all of the code being received by a company was subject to various inbound licenses with conflicting terms, the process for authorizing the use of the code would be cumbersome because of the need for constant checks for compliance with the various licenses. Whenever contributed code were to be used, the particular license terms for every single file would need to be reviewed to ascertain whether the application would be permitted under the terms of that code’s specific license. This would require considerable human resources and would slow down the engineers trying to utilize the code.

The benefits that a company receives under a standard inbound license pass to downstream users as well. Explicit patent permissions and disclaimers of obligations and warranties clarify the recipients’ rights and duties. The broad grant of rights provides code recipients opportunities to make productive use of the software. Adherence to a single standard license promotes consistency and common understanding for all parties involved.

Why do I have to sign?

For an agreement to be legally binding, a certain amount of legal ceremony must take place. This varies by jurisdiction. In some places, ‘clickwrap’ or ‘browse wrap’ agreements are used; however, these are gray areas of legal validity. A ‘wet signature’ is valid everywhere and avoids ambiguity of assent.

What does it do?

The CLA essentially does three things. It ensures that the contributor agrees that:

  1. The project may use the source code and redistribute it.
  2. The contribution is theirs to give, e.g. it does not belong to their employer or someone else.
  3. The contribution does not contain any patented ‘stuff’.

How long will it take?

The entire process should take less than an hour. 30-40 minutes is typical.

Mechanics of the CLA

The Lisp-Stat project uses CLAs to accept regular contributions from individuals and corporations, and to accept larger grants of existing software products, for example if you wished to contribute a large XLISP-STAT library.

Contributions to this project must be accompanied by a Contributor License Agreement. You (or your employer) retain the copyright to your contribution; this simply gives us permission to use and redistribute your contributions as part of the project.

You generally only need to submit a CLA once, so if you have already submitted one (even if it was for a different project), you probably do not need to do it again. To get the process started, download and sign the CLA (A4, US-Letter), then, in your PR (pull request), include a copy in a /LICENSE directory of the repository, creating the directory if it doesn’t exist. This needs to be done only once per contributor.

Code of Conduct

The following code of conduct is not meant as a means for punishment, action or censorship for the mailing list or project. Instead, it is meant to set the tone, expectations and comfort level for contributors and those wishing to participate in the community.

  • We ask everyone to be welcoming, friendly, and patient.
  • Flame wars and insults are unacceptable in any fashion, by any party.
  • Anything can be asked, and “RTFM” is not an acceptable answer.
  • Neither is “it’s in the archives, go read them”.
  • Statements made by core developers can be quoted outside of the list.
  • Statements made by others cannot be quoted outside the list without explicit permission.
    • Anonymised paraphrased statements (“someone asked about…”) are OK.
    • Direct quotes, with or without names, are not appropriate.
  • The community administrators reserve the right to revoke the subscription of members (including mentors) that persistently fail to abide by this Code of Conduct.

1 - Contributing Code

How to contribute code to Lisp-Stat

First, ensure you have signed a contributor license agreement. Then follow the steps below to contribute to Lisp-Stat.

You may also be interested in the additional information at the end of this document.

Get source code

First you need the Lisp-Stat source code. The core systems are found on the Lisp-Stat GitHub page. For the individual systems, just check out the one you are interested in. For the entire Lisp-Stat system, at a minimum you will need:

Other dependencies will be pulled in by Quicklisp.

Development occurs on the “master” branch. To get all the repos, you can use the following command in the directory you want to be your top level dev space:

git clone https://github.com/Lisp-Stat/data-frame.git && \
git clone https://github.com/Lisp-Stat/dfio.git && \
git clone https://github.com/Lisp-Stat/special-functions.git && \
git clone https://github.com/Lisp-Stat/numerical-utilities.git && \
git clone https://github.com/Lisp-Stat/documentation.git && \
git clone https://github.com/Lisp-Stat/lisp-stat.git && \
git clone https://github.com/Lisp-Stat/plot.git && \
git clone https://github.com/Lisp-Stat/select.git && \
git clone https://github.com/Lisp-Stat/array-operations.git && \
git clone https://github.com/Lisp-Stat/sqldf.git
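
The same set of clones can also be written as a loop, which is easier to re-run after a partial failure because existing directories are skipped. This is only a sketch: the repository names are taken from the commands above, while DRY_RUN is a hypothetical switch introduced here that prints each command instead of running it (set DRY_RUN=0 to actually clone).

```shell
#!/bin/sh
# Repository names copied from the clone commands above.
repos="data-frame dfio special-functions numerical-utilities documentation lisp-stat plot select array-operations sqldf"
base_url="https://github.com/Lisp-Stat"

# DRY_RUN is a hypothetical switch for this sketch, not a git feature.
# When it is 1 (the default) the commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

processed=0
for repo in $repos; do
  # Skip anything already cloned so the script can be re-run safely.
  if [ -d "$repo" ]; then
    echo "skipping $repo (directory already exists)"
    continue
  fi
  if [ "$DRY_RUN" = "1" ]; then
    echo "git clone $base_url/$repo.git"
  else
    git clone "$base_url/$repo.git" || exit 1
  fi
  processed=$((processed + 1))
done
echo "processed $processed repositories"
```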

Modify the source

Before you start, send a message to the Lisp-Stat mailing list or file an issue on GitHub describing your proposed changes. Doing this helps to verify that your changes will work with what others are doing and have planned for the project. Importantly, there may be some existing code or design work for you to leverage that is not yet published, and we’d hate to see work duplicated unnecessarily.

Be patient; it may take folks a while to understand your requirements. For large systems or design changes, a design document is preferred. For small changes, issues and the mailing list are fine.

Once your suggested changes are agreed upon, you can modify the source code and add some features using your favorite IDE.

The following sections provide tips for working on the project:

Coding Convention

Please consider the following before submitting a pull request:

  • Code should be formatted according to the Google Common Lisp Style Guide
  • All code should include unit tests. Currently we use fiveam as the test framework for new projects, but are looking at Parachute and Rove as more extensible alternatives.
  • Contributions should pass existing unit tests
  • New unit tests should be provided to demonstrate bugs and fixes
  • Indentation in Common Lisp is important for readability. Contributions should adhere to these guidelines. For the most part, a properly configured Emacs will do this automatically.

Code review

GitHub includes code review tools that can be used as part of a pull request. We recommend using a triangular workflow and feature/bug branches in your own repository to work from. Once you submit a pull request, one of the committers will review it and possibly request modifications.
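
As a rough illustration of a triangular workflow, the sketch below fetches from one remote and pushes to another. It runs entirely in a temporary directory: two bare repositories stand in for the project repository and your fork, and the remote names upstream and origin, the branch name fix-typo, and the identity values are assumptions chosen for the example, not project requirements.

```shell
#!/bin/sh
set -e
orig_dir=$PWD
work=$(mktemp -d)

# Stand-ins for the shared project repo ("upstream") and your fork ("origin").
git init -q --bare "$work/upstream.git"
git init -q --bare "$work/fork.git"

git init -q "$work/local"
cd "$work/local"
git config user.name "Example Contributor"        # placeholder identity
git config user.email "contributor@example.com"

git remote add upstream "$work/upstream.git"      # where you fetch from
git remote add origin "$work/fork.git"            # where you push to
git config remote.pushDefault origin              # plain "git push" targets the fork
git config push.default current                   # push the current branch under its own name

# Work happens on a feature/bug branch, never directly on master.
git checkout -q -b fix-typo
echo "fix" > notes.txt
git add notes.txt
git commit -q -m "Fix typo in notes"
git push -q                                       # lands in the fork only

fork_branches=$(git ls-remote --heads origin | wc -l | tr -d ' ')
upstream_branches=$(git ls-remote --heads upstream | wc -l | tr -d ' ')
echo "fork branches: $fork_branches, upstream branches: $upstream_branches"

cd "$orig_dir"
rm -rf "$work"
```

With real repositories you would replace the two local paths with the URL of the Lisp-Stat repository and the URL of your fork, then open the pull request from the pushed branch.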

As a contributor you should organise (squash) your git commits to make them understandable to reviewers:

  • Combine WIP and other small commits together.
  • Address multiple issues, for smaller bug fixes or enhancements, with a single commit.
  • Use separate commits to allow efficient review, separating out formatting changes or simple refactoring from core changes or additions.
  • Rebase this chain of commits on top of the current master
  • Write a good git commit message
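
One way to combine WIP commits without an interactive rebase is a soft reset back to the commit the work started from. The sketch below demonstrates this in a throwaway repository; the file name, commit messages, and identity values are invented for the example, and interactive rebase remains an equally valid way to get the same result.

```shell
#!/bin/sh
set -e
orig_dir=$PWD
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Example Contributor"        # placeholder identity
git config user.email "contributor@example.com"

echo "base" > file.txt
git add file.txt
git commit -q -m "Initial commit"
base=$(git rev-parse HEAD)                        # the commit the work starts from

# Three small work-in-progress commits.
for n in 1 2 3; do
  echo "change $n" >> file.txt
  git commit -q -a -m "WIP $n"
done

# Move the branch back to the base commit while keeping every change staged,
# then record the combined changes as a single commit.
git reset -q --soft "$base"
git commit -q -m "Add example feature"

commit_count=$(git rev-list --count HEAD)
last_subject=$(git log -1 --format=%s)
line_count=$(wc -l < file.txt | tr -d ' ')
echo "commits: $commit_count, last: $last_subject, lines in file: $line_count"

cd "$orig_dir"
rm -rf "$demo"
```

The history ends up with the initial commit plus one squashed commit, and the file still contains all three changes.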

Once all the comments in the review have been addressed, a Lisp-Stat committer completes the following steps to commit the patch:

  • If the master branch has moved forward since the review, rebase the branch from the pull request on the latest master and re-run tests.
  • If all tests pass, the committer amends the last commit message in the series to include “this closes #1234”. This can be done with interactive rebase. When on the branch issue: git rebase -i HEAD^
    • Change where it says “pick” on the line with the last commit, replacing it with “r” or “reword”. It replays the commit, giving you the opportunity to change the commit message.
  • The committer pushes the commit(s) to the GitHub repo.
  • The committer resolves the issue with a message like "Fixed in <Git commit SHA>".
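
When only the most recent commit needs the closing reference, git commit --amend is a non-interactive alternative to the interactive rebase described above. The following sketch shows it in a throwaway repository; the commit message and identity values are made up, and 1234 is the placeholder issue number used in the text.

```shell
#!/bin/sh
set -e
orig_dir=$PWD
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Example Committer"          # placeholder identity
git config user.email "committer@example.com"

echo "fix" > fix.txt
git add fix.txt
git commit -q -m "Fix rounding in summary output"

# Keep the existing message and append the closing reference as a new paragraph.
old_message=$(git log -1 --format=%B)
git commit -q --amend -m "$old_message" -m "This closes #1234"

new_subject=$(git log -1 --format=%s)
new_message=$(git log -1 --format=%B)
echo "$new_message"

cd "$orig_dir"
rm -rf "$demo"
```

The subject line is unchanged and the reference is added as a separate paragraph. Amending rewrites the commit, so the branch has to be pushed again afterwards.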

Additional Info

Where to start?

If you are new to statistics or Lisp, documentation updates are always a good place to start. You will become familiar with the workflow, learn how the code functions and generally become better acquainted with how Lisp-Stat operates. Besides, any contribution will require documentation updates, so it’s good to learn this system first.

If you are coming from an existing statistical environment, consider porting an XLispStat package that you find useful to Lisp-Stat. Use the XLS compatibility layer to help. If there is a function missing in XLS, raise an issue and we’ll create it. Some XLispStat code to browse:

Keep in mind that some of these rely on the XLispStat graphics system, which was native to the platform. LISP-STAT uses Vega for visualizations, so there isn’t a direct mapping. Non-graphical code should be a straightforward port.

You could also look at CRAN, which contains thousands of high-quality packages.

For specific ideas that would help, see the ideas page.

Issue Guidelines

Please comment on issues in GitHub, making your concerns known. Please also vote for issues that are a high priority for you.

Please refrain from editing descriptions and comments if possible, as edits spam the mailing list and clutter the audit trail, which is otherwise very useful. Instead, preview descriptions and comments using the preview button (on the right) before posting them. Keep descriptions brief and save more elaborate proposals for comments, since descriptions are included in GitHub’s automatically sent messages. If you change your mind, note this in a new comment, rather than editing an older comment. The issue should preserve the history of the discussion.

2 - Contributing to Documentation

You can help make Lisp-Stat documentation better

Creating and updating documentation is a great way to learn. You will not only become more familiar with Common Lisp, but also have a chance to investigate the internals of all parts of a statistical system.

We use Hugo to format and generate the website, the Docsy theme for styling and site structure, and Netlify to manage the deployment of the documentation site (what you are reading now). Hugo is an open-source static site generator that provides us with templates, content organisation in a standard directory structure, and a website generation engine. You write the pages in Markdown (or HTML if you want), and Hugo wraps them up into a website.

All submissions, including submissions by project members, require review. We use GitHub pull requests for this purpose. Consult GitHub Help for more information on using pull requests.

Repository Organisation

Declt generates documentation for individual systems in Markdown format. These are kept with the project, e.g. select/docs/select.md.

Conventions

Please follow the Microsoft Style Guide for technical documentation.

Quick Start

Here’s a quick guide to updating the docs. It assumes you are familiar with the GitHub workflow and you are happy to use the automated preview of your doc updates:

  1. Fork the Lisp-Stat documentation repo on GitHub.
  2. Make your changes and send a pull request (PR).
  3. If you are not yet ready for a review, add “WIP” to the PR name to indicate it’s a work in progress. (Don’t add the Hugo property “draft = true” to the page front matter, because that prevents the auto-deployment of the content preview described in the next point.)
  4. Wait for the automated PR workflow to do some checks. When it’s ready, you should see a comment like this: deploy/netlify — Deploy preview ready!
  5. Click Details to the right of “Deploy preview ready” to see a preview of your updates.
  6. Continue updating your doc and pushing your changes until you’re happy with the content.
  7. When you’re ready for a review, add a comment to the PR, and remove any “WIP” markers.

Updating a single page

If you’ve just spotted something you’d like to change while using the docs, Docsy has a shortcut for you (do not use this for reference docs):

  1. Click Edit this page in the top right hand corner of the page.
  2. If you don’t already have an up to date fork of the project repo, you are prompted to get one - click Fork this repository and propose changes or Update your Fork to get an up to date version of the project to edit. The appropriate page in your fork is displayed in edit mode.
  3. Follow the rest of the Quick Start process above to make, preview, and propose your changes.

Previewing locally

If you want to run your own local Hugo server to preview your changes as you work:

  1. Follow the instructions in Getting started to install Hugo and any other tools you need. You’ll need at least Hugo version 0.45 (we recommend using the most recent available version), and it must be the extended version, which supports SCSS.

  2. Fork the Lisp-Stat documentation repo into your own repository project, then create a local copy using git clone. Don’t forget to use --recurse-submodules or you won’t pull down some of the code you need to generate a working site.

    git clone --recurse-submodules --depth 1 https://github.com/lisp-stat/documentation.git
    
  3. Run hugo server in the site root directory. By default your site will be available at http://localhost:1313/. Now that you’re serving your site locally, Hugo will watch for changes to the content and automatically refresh your site.

  4. Continue with the usual GitHub workflow to edit files, commit them, push the changes up to your fork, and create a pull request.

Creating an issue

If you’ve found a problem in the docs, but are not sure how to fix it yourself, please create an issue in the Lisp-Stat documentation repo. You can also create an issue about a specific page by clicking the Create Issue button in the top right hand corner of the page.

3 - Contribution Ideas

Some ideas on how to contribute to Lisp-Stat

SQLite

There isn’t a good, maintained wrapper for SQLite that doesn’t have a restricted license. Using CFFI and autowrap, create a Lisp interface for SQLite. This will allow us to use sqldf with something other than PostgreSQL.

Special Functions

The functions underlying the statistical distributions require skills in numerical programming. If you like being ‘close to the metal’, this is a good area for contributions. Suitable for medium-advanced level programmers. In particular we need implementations of:

  • gamma
  • incomplete gamma (upper & lower)
  • inverse incomplete gamma

This work is partially complete and makes a good starting point for someone who wants to make a substantial contribution.

Documentation

Better and more documentation is always welcome, and a great way to learn. Suitable for beginners to Common Lisp or statistics.

Jupyter-Lab Integrations

Jupyter Lab has two nice integrations into Pandas, the Python version of Data-Frame, that would make great contributions: Qgrid, which allows editing a data frame in Jupyter Lab, and Jupyter DataTables. There are many more Pandas/Jupyter integrations, and any of them would be welcome additions to the Lisp-Stat ecosystem.

Plotting

LISP-STAT has a basic plotting system, but there is always room for improvement. An interactive REPL-based plotting system should be possible with a medium amount of effort. Remote-js provides a working example of running JavaScript in a browser from a REPL, and could be combined with something like Electron and a DSL for Vega-lite specifications. This may be a 4-6 week project for someone with JavaScript and HTML skills. There are other Plotly/Vega options, so if this interests you, open an issue and we can discuss. I have working examples of much of this, but they are all fragmentary. Skills: good web/JavaScript, beginner Lisp.

Regression

We have some code for ‘quick & dirty’ regressions and need a more robust DSL (Domain Specific Language). As a prototype, a port of the -proto regression objects from XLISP-STAT would be both useful and a good experiment to see what form the final version should take. This is a relatively straightforward port, e.g. defproto -> defclass and defmeth -> defmethod. Skill level: medium in both Lisp and statistics, or willing to learn.

Vector Mathematics

We have code for vectorized versions of all Common Lisp functions, living in the elmt package. It currently works only on vectors. Shadowing Common Lisp mathematical operators is possible, and more natural. This task is to make the elmt vectorized math functions work on lists as well as vectors, and to implement shadowing of Common Lisp. This task requires at least medium-high level Lisp skills, since you will be working with both packages and shadowing. We also need to run the ANSI Common Lisp conformance tests on the results to ensure nothing gets broken in the process.

Continuous Integration

If you have experience with GitHub’s CI tools, a CI setup for Lisp-Stat would be a great help. This allows people making pull requests to immediately know if their patches break anything. Beginner level Lisp.