Vladimir Prus


vladimirprus.com

Tuesday, January 27, 2015

Lean Analytics

Last year, I often needed to display and analyze timestamped events, such as product evaluations, issue tracker activity or credit card expenses. After trying a few approaches, I ended up writing a JavaScript library called Lean Analytics. It's based on dc.js, crossfilter.js and D3.js, and looks like this:


The easiest way to understand it is to just play with the demo or take a look at the demo source code. Below I'll explain what it is, when you'd want to use it, and when not.

Overview

The primary goal was simply to show trends visually in data that has already been collected but is rather dry. The amount of data is fairly small, the dimensions are few, and there's no need to extract hidden correlations between dozens of values, nor a need for dedicated analysts to tweak the charts on a full-time basis. Rather, I wanted it to be extra easy to chart a new type of data, to avoid storing anything in the cloud, and to embed the charts in existing web apps.

The library itself is bundled into a single JavaScript file, plus you need to include 3 CSS files. You also need to write code to define where to get the data, what metrics to show, and how to group your entries, all of which is straightforward. In return, you get a lot of fine-tuned visuals:

  • A chart showing the main metric (such as transaction amount) aggregated per week, as well as a derived metric (such as a trendline). There are also dropdowns to select the desired metrics.
  • Compact linear charts showing the distribution of the chosen metric over categories.
  • A tabular view of the data.
  • Filtering of the main chart by category values in real time in your browser. The filters are even stored as part of the URL, so you can share links easily.
  • Buttons to select time ranges.
  • Automatic progress and error reporting for loading data.
The charts are meant to replace a div in your host HTML document, and they use Bootstrap for styling, so they will probably work just fine inside your internal web apps.
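To make the "aggregated per week" and "trendline" ideas concrete, here is a minimal sketch in plain JavaScript. It is not how Lean Analytics or dc.js implements them: the function names (weekStart, aggregateByWeek, movingAverage), the sample events, and the choice of a trailing moving average as the trendline are all assumptions made for illustration.

```javascript
// Sketch only: plain JavaScript, not the Lean Analytics or dc.js API.
// weekStart, aggregateByWeek and movingAverage are hypothetical names.

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Return the date (YYYY-MM-DD, UTC) of the Monday that starts the event's week.
function weekStart(timestamp) {
  const d = new Date(timestamp);
  const midnight = Date.UTC(d.getUTCFullYear(), d.getUTCMonth(), d.getUTCDate());
  const daysSinceMonday = (d.getUTCDay() + 6) % 7; // getUTCDay(): 0 = Sunday
  return new Date(midnight - daysSinceMonday * MS_PER_DAY).toISOString().slice(0, 10);
}

// Sum a numeric metric per week; returns [{ week, total }] sorted by week.
// Weeks without any events are simply absent from the result.
function aggregateByWeek(events, metric) {
  const totals = new Map();
  for (const e of events) {
    const week = weekStart(e.timestamp);
    totals.set(week, (totals.get(week) || 0) + metric(e));
  }
  return [...totals.entries()]
    .sort(([a], [b]) => a.localeCompare(b))
    .map(([week, total]) => ({ week, total }));
}

// Derived metric: trailing moving average over up to `window` weeks.
function movingAverage(rows, window) {
  return rows.map((row, i) => {
    const slice = rows.slice(Math.max(0, i - window + 1), i + 1);
    const mean = slice.reduce((sum, r) => sum + r.total, 0) / slice.length;
    return { week: row.week, trend: mean };
  });
}

// Made-up sample data in the spirit of "credit card expenses".
const events = [
  { timestamp: "2015-01-05T10:00:00Z", amount: 20, category: "food" },
  { timestamp: "2015-01-07T18:30:00Z", amount: 30, category: "travel" },
  { timestamp: "2015-01-13T09:15:00Z", amount: 10, category: "food" },
  { timestamp: "2015-01-25T21:00:00Z", amount: 40, category: "food" },
];

const weekly = aggregateByWeek(events, (e) => e.amount);
const trend = movingAverage(weekly, 2);
console.log(weekly);
console.log(trend);
```

With this sample data the weekly totals are 50, 10 and 40 for the weeks starting 2015-01-05, 2015-01-12 and 2015-01-19, and the two-week trend values are 50, 30 and 25.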

Alternatives

dc.js is the foundation for Lean Analytics and, together with crossfilter, does all the hard work of filtering data in real time in your browser. It can be used to create far more interesting visualizations, but it requires a considerable amount of code to configure all the details, way more than I was comfortable with.
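For readers who have not used crossfilter, the core idea can be pictured with a small sketch in plain JavaScript. This is not crossfilter's API or its implementation (the real library uses sorted indexes and incremental updates to stay fast); createCrossFilter and its filter and group methods are hypothetical names, and the sample records are made up. The one behavior it tries to mirror is that a chart's own filter does not hide its own bars, while filters set on other charts do apply.

```javascript
// Sketch of the cross-filtering idea in plain JavaScript.
// Not crossfilter's API: createCrossFilter, filter and group are hypothetical.

function createCrossFilter(records) {
  const filters = new Map(); // dimension name -> Set of accepted values

  // Does a record pass every active filter, ignoring one dimension's filter?
  function passes(record, ignoredDimension) {
    for (const [dimension, accepted] of filters) {
      if (dimension === ignoredDimension) continue;
      if (!accepted.has(record[dimension])) return false;
    }
    return true;
  }

  return {
    // Restrict a dimension to the given values; an empty list clears the filter.
    filter(dimension, values) {
      if (values.length === 0) filters.delete(dimension);
      else filters.set(dimension, new Set(values));
    },
    // Sum `metric` per value of `dimension`, honoring filters on other dimensions.
    // The dimension's own filter is ignored, so its chart still shows every bar.
    group(dimension, metric) {
      const totals = {};
      for (const r of records) {
        if (!passes(r, dimension)) continue;
        totals[r[dimension]] = (totals[r[dimension]] || 0) + r[metric];
      }
      return totals;
    },
  };
}

// Made-up sample data.
const expenses = [
  { category: "food", card: "visa", amount: 20 },
  { category: "travel", card: "visa", amount: 30 },
  { category: "food", card: "amex", amount: 10 },
  { category: "travel", card: "amex", amount: 40 },
];

const cf = createCrossFilter(expenses);
const unfiltered = cf.group("card", "amount");     // { visa: 50, amex: 50 }
cf.filter("category", ["food"]);                   // user clicks "food"
const byCard = cf.group("card", "amount");         // { visa: 20, amex: 10 }
const byCategory = cf.group("category", "amount"); // { food: 30, travel: 70 }
console.log(unfiltered, byCard, byCategory);
```

This sketch rescans every record on each call, which is fine for a few thousand entries but is exactly the work that crossfilter is designed to avoid.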

Several libraries implement charts on top of D3, such as NVD3 and C3. Sadly, those are not integrated with crossfilter, and are somewhat in a state of flux.

Google Charts is very solid as far as charting goes, but does not support any crossfiltering either.

Kibana is a full-blown dashboard solution on top of ElasticSearch. It's certainly great for serious data analysis, but it is not trivial to set up and is not embeddable in web apps.

Mixpanel is fairly nice, but it's a cloud service, and I either did not want to or could not put the data in the cloud.

Zenobase, finally, is a very nice solution specific to life tracking, meant to answer questions like "how is my blood pressure correlated with weight". It is inspiring in some ways, but it is also a cloud service, and too specialized for life tracking to be directly useful.

Conclusion

If you want to chart timestamped events with numeric values that are naturally aggregated over weeks, and you want to filter data by categories in real time, and the amount of data is not very large, give Lean Analytics a try.

4 comments:

Eric Jain said...

Neat! btw Zenobase currently uses Highcharts JS for most of the charts, worth checking out.

Vladimir Prus said...

Thanks for the suggestion! I think I actually looked at the Zenobase source before to see what chart library it uses. Highcharts is sort of standard, but I found that the D3+dc.js stack is a bit more open/flexible.

Só prá registrar said...

Hi! How do you do the Category selector? DC.js doesn't have a stacked row chart...

Vladimir Prus said...

The category selector uses a custom chart, over at https://github.com/vprus/lean-analytics/blob/master/src/unrolled-pie-chart.js - it has a slight bug with handling the automatic 'other' category, but otherwise works fine.