ProgressiVis is a Python toolkit and scientific workflow system that implements a new programming paradigm we call Progressive Analytics. It lets analysts see the progress of their analysis and steer it while the computation is still running. See the workshop paper.
Instead of running algorithms to completion one after the other, as done in all existing scientific analysis systems, ProgressiVis modules run in short batches. Each batch is allowed to run only for a specific quantum of time (typically 1 second); it produces a usable result at the end and then yields control to the next module. To perform the whole computation, ProgressiVis loops over the modules as many times as necessary to converge to a result that the analyst considers satisfactory.
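The batch-and-yield loop described above can be illustrated with a minimal sketch. This is not the ProgressiVis API: the `RunningMean` class, its `run_step`, `done` and `result` methods, and the `run` function are all hypothetical names chosen for this example. It only shows the idea of giving each module a time quantum, reading a usable partial result after each batch, and looping until every module has finished.

```python
import time


class RunningMean:
    """Hypothetical module: progressively averages a list of numbers."""

    def __init__(self, data, chunk=1000):
        self.data = data
        self.chunk = chunk      # items processed between two clock checks
        self.pos = 0            # number of items consumed so far
        self.total = 0.0

    def done(self):
        return self.pos >= len(self.data)

    def result(self):
        # Usable at any time: the mean of the items seen so far.
        return self.total / self.pos if self.pos else None

    def run_step(self, quantum):
        """Process chunks until the time quantum (in seconds) is used up."""
        deadline = time.monotonic() + quantum
        while not self.done():
            end = min(self.pos + self.chunk, len(self.data))
            self.total += sum(self.data[self.pos:end])
            self.pos = end
            if time.monotonic() >= deadline:
                break  # yield control to the next module


def run(modules, quantum=1.0, on_progress=None):
    """Loop over the modules until every one of them has finished."""
    while not all(m.done() for m in modules):
        for m in modules:
            if m.done():
                continue
            m.run_step(quantum)
            if on_progress is not None:
                on_progress(m)  # e.g. refresh a visualization


# Usage: a zero quantum forces exactly one chunk per batch, which makes
# the intermediate results easy to follow.
partials = []
mean = RunningMean(list(range(10)), chunk=3)
run([mean], quantum=0.0, on_progress=lambda m: partials.append(m.result()))
print(partials)
```

Each entry printed at the end is the mean of the items consumed so far, so an analyst watching `on_progress` would see the estimate refine batch after batch. A real system would also let the analyst change parameters between batches; that steering step is left out of this sketch.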
ProgressiVis relies on well-known Python libraries, such as NumPy, SciPy, Pandas, and scikit-learn.
For now, ProgressiVis is mostly a proof of concept. It has bugs, but more importantly, the standard Python libraries are not well suited to progressive execution. In particular, NumPy, SciPy, and Pandas are not good at growing arrays and DataFrames dynamically: they require the whole array to be reconstructed from scratch. This reconstruction is currently extremely costly, but could become almost acceptable with some internal changes. The current implementation provides replacement data structures that can grow and adapt to progressive computation.
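To make the cost of growing arrays concrete, here is a minimal sketch of one common workaround: an over-allocated buffer whose capacity doubles when it fills up. It does not show the data structures that ProgressiVis actually ships. `GrowableArray` and its members are hypothetical names, and capacity doubling is an assumption chosen for illustration. Appending with `np.concatenate` copies every existing element on each call, whereas the buffer below copies only when its capacity is exceeded.

```python
import numpy as np


class GrowableArray:
    """Hypothetical 1-D array that grows by doubling its capacity."""

    def __init__(self, dtype=float, capacity=16):
        self._buf = np.empty(capacity, dtype=dtype)
        self._size = 0  # number of valid elements in the buffer

    def append(self, values):
        values = np.atleast_1d(np.asarray(values, dtype=self._buf.dtype))
        needed = self._size + len(values)
        if needed > len(self._buf):
            # Reallocate only when full; doubling keeps the amortized
            # cost per appended element constant.
            new_capacity = max(needed, 2 * len(self._buf))
            new_buf = np.empty(new_capacity, dtype=self._buf.dtype)
            new_buf[:self._size] = self._buf[:self._size]
            self._buf = new_buf
        self._buf[self._size:needed] = values
        self._size = needed

    @property
    def values(self):
        # A view on the valid part of the buffer, without copying.
        return self._buf[:self._size]


# Naive growth: every call allocates a new array and copies all the data.
naive = np.empty(0)
for i in range(5):
    naive = np.concatenate([naive, [i, i + 0.5]])

# Buffered growth: same content, far fewer reallocations.
grown = GrowableArray(capacity=2)
for i in range(5):
    grown.append([i, i + 0.5])

print(np.array_equal(naive, grown.values))
```

The trade-off is extra memory for the unused capacity, and any view obtained from `values` stops tracking the array once a reallocation replaces the buffer.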
ProgressiVis can be installed with pip, with or without a virtualenv. From a virtualenv or from the global environment, install it with:
pip install -r requirements.txt
python setup.py install
or, with Anaconda:
conda config --add channels progressivis
conda install progressivis
To see examples, either look at the tests in the tests directory, or try the examples in the examples directory.
If you are having issues, please let us know by opening an issue.
The project is licensed under the BSD license.