Pylint Velocity

This is a simple data visualizer. It uses pylint to find common problems in Python code, and graphs how the results change with each commit to your project.

I'm So Meta, Even This Acronym

Usage

$ git clone https://gitlab.com/robru/pylint-velocity.git
$ cd pylint-velocity
$ mkdir some-project
$ git clone https://some/project some-project/target
# or
$ bzr branch lp:some-project some-project/target
$ make

This will generate a pile of SVGs in the project directory that indicate the changes in certain pylint stats over time. I consider the most interesting to be statements.svg, attributable-lines.svg, fixme.svg, line-too-long.svg, nb duplicated lines.svg, rating.svg, and too-many-branches.svg.

Duplicate Lines in Large Project

The first run takes a couple of hours, depending on your hardware and the size of the version control history. Subsequent runs are faster because they only collect new data instead of re-scanning the entire history each time.

If you have multiple projects, you can fetch or graph just one of them with the following commands, assuming that the bzr/git repo lives at ./some-project/target:

$ make fetch-some-project
$ make graph-some-project
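If you keep several projects side by side, a small loop can build these per-project commands for you. The following is only a sketch: list_projects is a hypothetical helper, not part of this repository, and it assumes that every directory containing a target checkout is a project and that project names contain no spaces. It prints the commands rather than running them.

```shell
#!/bin/sh
# Hypothetical helper (not part of pylint-velocity): print the name of
# every directory directly under ROOT that contains a target/ checkout.
list_projects() {
    root=${1:-.}
    for dir in "$root"/*/target; do
        # Skip the unexpanded glob when nothing matches.
        [ -d "$dir" ] || continue
        basename "$(dirname "$dir")"
    done
}

# Print (rather than run) the per-project make commands.
for project in $(list_projects .); do
    echo "make fetch-$project"
    echo "make graph-$project"
done
```

Pipe the output to sh once the printed commands look right.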

To run this project on itself to give you an idea of how it works, try this:

$ make example

Configuration

This project consists of a shell script for data acquisition with a linting tool, and a Python script for graph production. Each one has its own configuration.

Linter

To configure the way data is acquired (e.g. to lint languages other than Python or to use a linter other than pylint), create a file called ./some-project/config.sh that defines $GREPS, $ACK and $LINTER.

$GREPS should be a space-separated list of things you'd grep your file list for in order to create a meaningful subset of the total files in your project. Typically this would correspond to the different python modules in your project, but it can be anything. For example, if your project defines modules foo, bar and tests, you might want this in your config.sh:

export GREPS="tests foo bar"
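To get a feel for how such patterns carve up a file list, here is a rough sketch. It is not the project's own script: count_per_grep is a hypothetical helper, and it assumes each $GREPS entry is applied as a plain grep pattern against the file paths. Note that a path can match more than one pattern (tests/test_foo.py would count for both tests and foo).

```shell
#!/bin/sh
# Same value as the config.sh example above.
GREPS="tests foo bar"

# Hypothetical helper: read a file list on stdin and print, for each
# pattern in $GREPS, how many paths match it.
count_per_grep() {
    files=$(cat)
    for pattern in $GREPS; do
        # grep -c exits non-zero on zero matches; "|| true" keeps going.
        count=$(printf '%s\n' "$files" | grep -c -- "$pattern" || true)
        printf '%s %s\n' "$pattern" "$count"
    done
}

# Example with a hard-coded file list; in practice the list would come
# from the $ACK command.
printf '%s\n' foo/a.py foo/b.py bar/c.py tests/test_a.py | count_per_grep
```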

$ACK is the command used to determine which files to lint, and defaults to reporting only Python files:

export ACK="ack -f --python"

$LINTER is the command used to generate the data to graph. The default is pylint3 with several checks disabled:

export LINTER="pylint --disable=import-error,no-member,too-few-public-methods,too-many-public-methods"
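Put together, a complete ./some-project/config.sh might look like the sketch below. The values are simply the examples from this section, so adjust each one to your project. Whether a different linter's output can be graphed as-is depends on how the data is parsed, which this example does not assume anything about.

```shell
#!/bin/sh
# Sketch of ./some-project/config.sh using the example values above.

# Subsets of the file list to report on separately.
export GREPS="tests foo bar"

# Command that lists the files to lint (Python files only).
export ACK="ack -f --python"

# Command that produces the data to graph.
export LINTER="pylint --disable=import-error,no-member,too-few-public-methods,too-many-public-methods"
```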

Graphs

This directory contains a config.yaml with all available configuration options and their explanations.

Requirements

Blame Data

This project supports cataloging and graphing of changes in blame data over time (graphing the number of lines attributable to each author at each revision). This is enabled by default and does not require any configuration.
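To illustrate what "lines attributable to each author" means, the sketch below tallies authors from the output of git blame --line-porcelain for a single file. It is an assumption-laden illustration, not necessarily how this project collects its blame data, and count_authors is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: read `git blame --line-porcelain` output on stdin
# and print "<line count> <author>" per author, largest count first.
# Source lines in porcelain output start with a tab, so they can never
# be mistaken for an "author " header.
count_authors() {
    awk '/^author / { sub(/^author /, ""); lines[$0]++ }
         END { for (a in lines) print lines[a], a }' | sort -rn
}

# Usage, inside a git checkout (the path is only an example):
#   git blame --line-porcelain path/to/file.py | count_authors
```

Repeating this for every file at every revision gives the kind of per-author series that the blame graphs plot.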

Spot Checks

If you have a large project whose data took a long time to generate, you can perform a spot check on that data by running make check-some-project. This will select 10 data files at random, set them aside, regenerate them, and then diff the old against the new. If no differences are found, it's a sign your data is in good shape.
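The idea behind the spot check can be sketched as follows. This is not the project's implementation: spot_check is a hypothetical function, the regeneration command is passed in by the caller because the real one is not shown here, and the sketch assumes GNU shuf is available and that file paths contain no whitespace.

```shell
#!/bin/sh
# spot_check DATA_DIR REGEN_CMD [COUNT]
# Hypothetical sketch: pick COUNT (default 10) random files under
# DATA_DIR, set each aside, recreate it with REGEN_CMD <path>, and
# compare old against new. Returns non-zero if any file differs.
spot_check() {
    data_dir=$1
    regen=$2
    count=${3:-10}
    backup=$(mktemp -d)
    status=0
    for file in $(find "$data_dir" -type f | shuf -n "$count"); do
        name=$(basename "$file")
        mv "$file" "$backup/$name"   # set the old copy aside
        "$regen" "$file"             # regenerate the data file
        if ! diff -q "$backup/$name" "$file" >/dev/null; then
            echo "MISMATCH: $file"
            status=1
        fi
    done
    rm -rf "$backup"
    return "$status"
}
```

A mismatch does not say which copy is wrong, only that the stored data and a fresh run disagree for that file.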

Written by Robert Bruce Park on Wednesday, August 12, 2015