Getting Started


VisFlow is a framework that lets you quickly launch data analyses on the web. Watch the short intro video to see how it works, and follow the tutorial steps to start your visualization and analysis.

1-Minute Intro Video


Tutorial


Load Data

Log in with demo mode to take a quick look at VisFlow!

Loading data into VisFlow is as simple as dragging a DataSource onto the canvas and adding a dataset to the DataSource. A few sample datasets have already been prepared for you in the demo.

Create a Plot

Drag a plot (in this case we use a Scatterplot) onto the canvas and then connect the output of the DataSource to the input of the Scatterplot. You may change the dimensions and other options of the Scatterplot in the control panel on the right.

Forward Selection

Interactive selection in a plot can be forwarded to another node for detailed exploration. We connect the Selection Port to a Table to observe the attributes of the selected data items.

You may press the shortcut A to open the node creation panel. More shortcuts are listed in the Shortcuts section.

Highlight Selection

VisFlow lets you bind rendering properties to data items so that you can identify interesting subsets of the input data.

Connect the selection port of the Scatterplot to a PropertyEditor, which sets the rendering properties of its input data items. Here we give the selected data items a red color. Rendering properties can be adjusted on the fly within VisFlow. To compare the selected items with the whole dataset, we use a Union set operation to combine the selected items with the original dataset, which is forwarded via the Scatterplot's forwarding port. We render the unified result in a new Scatterplot.

Introduction


VisFlow is a web-based framework that enables the creation of interactive visualizations for tabular data based on flow diagram editing. The flow diagrams of VisFlow define the system behavior, including querying and analytics logic. VisFlow employs a special type of flow model, called the subset flow model.

Featuring easy plotting, interactive filtering and highlighting, VisFlow facilitates visualization and data analysis in an easy-to-use web environment, without the hassle of query programming and system engineering.

Flow Diagram


A flow diagram defines how the VisFlow system works. A diagram consists of nodes and directed edges, and is topologically a directed acyclic graph (DAG). The diagrams in VisFlow follow the subset flow model.

Nodes

A node is a primitive component for a specific task. For example, a DataSource node loads a tabular dataset and a Visualization node presents a plot.

Each node optionally accepts data and produces output data. Data enters and leaves a node via its ports.

Ports

A node has ports located on its boundaries. The input ports on the left of a node define the input the node accepts. The output ports on the right of a node define what the node produces for its downflow nodes. Ports are connected by edges.

The connectivity of a port is either single or multiple. A single port can be connected to at most one edge, while a multiple port has no such restriction.

A port that outputs selected data items is a selection port. Visualization nodes have selection ports.

A port that accepts data subsets is a data port (shown with a white background). Otherwise it accepts constants and is a constant port (shown with a gray background).

Edges

Edges define how data items are transmitted and processed across the nodes. Each edge connects two ports, and goes from the output port of an upflow node to an input port of a downflow node.
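
The rules above (edges run from an output port to an input port, a single port takes at most one edge, and the diagram stays acyclic) can be sketched as a small data model. This is only an illustrative sketch, not VisFlow's implementation; the class names Port and Diagram, the port names, and the connect method are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Port:
    node: str               # id of the owning node (hypothetical naming)
    name: str               # e.g. "in", "out", "selection"
    is_input: bool
    multiple: bool = False  # a "single" port accepts at most one edge


@dataclass
class Diagram:
    ports: dict = field(default_factory=dict)  # (node, name) -> Port
    edges: list = field(default_factory=list)  # (source key, target key)

    def add_port(self, port):
        self.ports[(port.node, port.name)] = port

    def _reaches(self, start, goal):
        """Return True if node `goal` can be reached from node `start`."""
        stack, seen = [start], set()
        while stack:
            node = stack.pop()
            if node == goal:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(dst[0] for src, dst in self.edges if src[0] == node)
        return False

    def connect(self, src, dst):
        """Add an edge from an output port to an input port, or raise ValueError."""
        out_port, in_port = self.ports[src], self.ports[dst]
        if out_port.is_input or not in_port.is_input:
            raise ValueError("edges go from an output port to an input port")
        for key, port in ((src, out_port), (dst, in_port)):
            used = any(key in edge for edge in self.edges)
            if used and not port.multiple:
                raise ValueError(f"single port {key} is already connected")
        if self._reaches(dst[0], src[0]):
            raise ValueError("edge would create a cycle")
        self.edges.append((src, dst))


# The tutorial diagram: DataSource -> Scatterplot -> (selection) -> Table.
d = Diagram()
d.add_port(Port("source", "out", is_input=False, multiple=True))
d.add_port(Port("plot", "in", is_input=True))
d.add_port(Port("plot", "selection", is_input=False, multiple=True))
d.add_port(Port("table", "in", is_input=True))
d.connect(("source", "out"), ("plot", "in"))
d.connect(("plot", "selection"), ("table", "in"))
```

Checking reachability before adding an edge keeps the graph a DAG at all times, so no separate validation pass is needed in this sketch.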

Subsets

Tabular data rows are considered to be meaningful data items across VisFlow. Items are transmitted between components through data ports in groups. Each transmitted group of data items is therefore a subset of the input tabular dataset.

Constants

Constants are a set of constant values, either specified by a user or extracted from the data attributes. Constants can be used to perform data filtering in filter nodes. For example, a Range Filter accepts two constants that define an attribute filtering range. Constants are transmitted between components through constant ports.

Rendering Properties

Data items can be assigned rendering properties within VisFlow. The rendering properties are transmitted along with the data items and can be modified by nodes within the flow. The visualization nodes in VisFlow respect the rendering properties associated with the data items.

Five types of rendering properties are supported in VisFlow:

  • Color: The filling color of the rendered shape.
  • Border: The color of the rendered shape's border.
  • Width: The width of the rendered shape's border.
  • Size: The size of the rendered shape.
  • Opacity: The opacity of the rendered element.

Note that not every visualization renders all five types of rendering properties. For example, the heatmap does not support size properties.

Subset Flow Model

The flow diagrams in VisFlow follow the subset flow model. The subset flow model requires that all input/output data of the nodes be a subset of the rows of some original input table of the system. Because of these subset relations, VisFlow can uniquely identify a data item and assign rendering properties to it, so that it is rendered in a uniform style across nodes.

The subset flow model restricts data modification within the system. For example, we cannot use a node to add a data column to the table, because doing so would create new table rows that do not belong to any of the system input tables. This would cause ambiguity in assigning rendering properties and prevent selection highlighting and subset identification. Despite some loss of data processing power, the subset flow model has the advantages of reducing flow diagram complexity and making subset tracking and comparison easier. Data is preferably processed and modified outside VisFlow; VisFlow is better suited to quick-start visualization and analysis of already processed data.
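
One way to picture the subset flow model is to identify each data item by its row index in the original table and to let every node pass along a mapping from row index to rendering properties. The sketch below assumes exactly that representation; the names TABLE, Subset, keep and set_property are hypothetical and the sample values are made up for illustration.

```python
from dataclasses import dataclass, field

# The original input table: each row is one data item, identified by its index.
TABLE = [
    {"name": "a", "mpg": 18.0},
    {"name": "b", "mpg": 31.5},
    {"name": "c", "mpg": 24.0},
]


@dataclass
class Subset:
    # Maps a row index of TABLE to that item's rendering properties.
    items: dict = field(default_factory=dict)

    def rows(self):
        return [TABLE[i] for i in sorted(self.items)]


def whole_table():
    return Subset({i: {} for i in range(len(TABLE))})


def keep(subset, predicate):
    """A node may drop items, but it never creates rows that are not in TABLE."""
    return Subset({i: dict(p) for i, p in subset.items.items() if predicate(TABLE[i])})


def set_property(subset, **props):
    """Attach rendering properties; the item identities stay unchanged."""
    return Subset({i: {**p, **props} for i, p in subset.items.items()})


efficient = set_property(keep(whole_table(), lambda row: row["mpg"] > 20), color="red")
```

Because every output only refers to existing row indices, an item highlighted in one node can be recognized as the same item in any other node.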

Data


Tabular Data

The tabular data used in VisFlow describes data items along a set of dimensions (attributes). It matches the entity-relationship database table definition. Table rows are considered to be unique data items in VisFlow. Table (a) below illustrates a sample VisFlow input table, which includes a data snippet from the World Bank Data.

Data Transpose

To facilitate series data analyses, VisFlow supports a special type of operation on its input table called Data Transpose, which takes a set of primary key attributes and a set of dimensions, and transforms the input table into rows of series points. Dimension names are written in an attribute column, and the original table values are stored in a third column. Table (b) above shows the data transpose of table (a) using "Country" as the primary key, and the years as the dimensions.

After data transpose, table (b) can be used for LineChart series visualization.
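
The transformation described above is a wide-to-long reshaping. A minimal sketch follows, assuming rows are plain dictionaries; the function name transpose, the output column names "attribute" and "value", and the sample values are placeholders chosen for illustration, not actual World Bank figures or VisFlow names.

```python
def transpose(rows, keys, dimensions, attribute="attribute", value="value"):
    """Turn each (row, dimension) pair into one series point."""
    out = []
    for row in rows:
        for dim in dimensions:
            point = {k: row[k] for k in keys}  # copy the primary key attributes
            point[attribute] = dim             # the dimension name
            point[value] = row[dim]            # the original table value
            out.append(point)
    return out


# Placeholder input in the shape of table (a): one row per country.
wide = [
    {"Country": "A", "2010": 1.0, "2011": 1.5},
    {"Country": "B", "2010": 2.0, "2011": 2.5},
]
long = transpose(wide, keys=["Country"], dimensions=["2010", "2011"])
```

A table with N rows and D transposed dimensions yields N × D series points, here 2 × 2 = 4.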

Upload Data

To upload custom data to VisFlow, open the upload data dialog by clicking the upload data button on the tool panel, and then choose a local file to upload.

VisFlow currently supports standard CSV data. A tabular dataset to be loaded with D dimensions and N data items (rows) will look as follows:

dimension_name_1,dimension_name_2,...,dimension_name_D
item1_value_1,item1_value_2,...,item1_value_D
item2_value_1,item2_value_2,...,item2_value_D
...
itemN_value_1,itemN_value_2,...,itemN_value_D

VisFlow will automatically recognize the value type of each column, and render the column values correspondingly in the plots.
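
How VisFlow detects column types is not described here. As a rough sketch of what such detection could look like, the following reads a CSV in the layout above and labels each column as "int", "float" or "string"; the function names, the three type labels, and the sample data are assumptions made for illustration.

```python
import csv
import io


def infer_type(values):
    """Return "int", "float" or "string" for a column of raw strings."""
    for name, cast in (("int", int), ("float", float)):
        try:
            for v in values:
                cast(v)
            return name
        except ValueError:
            continue  # at least one value does not parse; try the next type
    return "string"


def load_csv(text):
    """Parse CSV text into (header, rows, column types)."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    rows = [row for row in reader if row]  # skip blank lines
    types = {
        name: infer_type([row[i] for row in rows])
        for i, name in enumerate(header)
    }
    return header, rows, types


sample = "name,mpg,cylinders\nbuick,18.5,8\nhonda,33,4\n"
header, rows, types = load_csv(sample)
```

In this sketch a column is numeric only if every value parses, so a single non-numeric entry makes the whole column a string column.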

Diagram Editing


Create Diagram

To create a node, choose Edit > Add Node in the menu to open the node creation panel. Alternatively, you may click the node creation panel tab on the left. An in-place node creation panel can be opened using the shortcut A.

Within the node creation panel, choose a node type and then drag-and-drop a node from the panel to the canvas.

To create an edge, drag from an output port to an input port (or vice versa). You may also drop an edge onto a node and VisFlow will automatically search for the first connectable port.

Edit Diagram

Node properties can be set in the option panel on the right. By default the option panel pops up when a node is clicked.

Click a node to select it. Drag a node to re-position it. Right-click on nodes, edges or ports to open the context menu.

Hold Alt to box-select multiple nodes at a time. All selected nodes can be re-positioned together.

Within Visualizations, mouse interactions are by default interactive selections. Hold Alt to re-position the underlying visualization node.

Click and drag on the canvas to pan and navigate.

Save / Load Diagram

Use Diagram > New/Save/Load in the menu to create a new diagram, save the current diagram, or load a previously saved diagram from the server.

Visualization Mode (VisMode)

VisFlow features Visualization Mode (VisMode), in which only selected nodes (typically visualizations) are shown and the other diagram details are hidden. The following illustrates a VisMode system and its corresponding flow diagram. Nodes can be set to be visible/invisible in VisMode via context menu or the option panel.

To toggle VisMode, press the VisMode button on the tool panel.

Share Diagram

VisFlow allows you to share a diagram with other users to support collaboration. In the save diagram dialog, simply enter the users you want to share the diagram with in the "share with" field. The users you list will be able to view and edit your shared diagram. Your uploaded data associated with the shared diagram will be automatically shared with those users as well.

Node Types


Data Source

DataSources are the only type of nodes that load tabular data from data files. They do not have input ports. A data source may load one or more tables of the same type. If multiple tables are loaded, they are combined and considered to be a single larger table throughout the system.

A data source can be set to perform a data transpose operation on its input table and output the transformed table instead. Data transpose is useful for plotting series data.

Visualization

Visualizations provide plots of data subsets. Plots are embedded in-situ in the visualization nodes by default. Interactively selected data items are sent out via the visualization selection port. Each visualization node additionally has a forwarding (multiple) port that outputs all its input data. This is for the convenience of getting upflow data.

VisFlow currently supports the following types of visualizations:

Table

Lists the data items and their attribute values in a table.

The first column of the table reflects the rendering properties of the data items (color and border).

Scatterplot

Renders a 2-dimensional scatterplot. Axes dimensions can be adjusted in the option panel.

Heatmap

Renders a heatmap. Dimensions to be plotted as heatmap columns and their order can be adjusted in the option panel. One dimension can be used for labeling the heatmap rows. The heatmap color scales can be chosen in the option panel.

Since a heatmap has its own color encoding for attribute values, rendering properties of data items are reflected by the row labels' rendering styles (font color) instead.

Histogram

Renders a histogram. The histogram assigns the data items to a suggested number of bins.

Data items with

Parallel Coordinates

Renders a parallel coordinates plot. Dimensions to be plotted and their order can be adjusted in the option panel.

Network

Renders a network. A network has two input ports, one for the nodes and one for the edges. Correspondingly, there are four output ports, for the selection and forwarding of nodes and edges respectively. One dimension can be used for labeling the network nodes.

The Network node has a navigation interaction mode. When navigation is on, you may pan and zoom (with the mouse wheel) the network. When navigation is off, you may interactively select nodes and edges in the network.

Line Chart

Renders a line chart for series data. The input data must contain a dimension without duplicate values to serve as the series dimension. The LineChart sorts the data items by their values on the series dimension, and treats the data items as data points of the series.

Optionally, a group-by dimension can be set so that table rows of the same group describe a single series. In that case the input data may contain multiple series.
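
The sorting and grouping behavior described for the Line Chart can be sketched as follows. The function name build_series and the sample rows are hypothetical; the rows have the shape produced by a data transpose, with a "year" series dimension and "Country" as the group-by dimension.

```python
from collections import defaultdict


def build_series(rows, series_dim, value_dim, group_by=None):
    """Group rows into series and sort each series by the series dimension."""
    groups = defaultdict(list)
    for row in rows:
        # Without a group-by dimension, all rows form one series keyed by None.
        groups[row[group_by] if group_by else None].append(row)
    series = {}
    for key, members in groups.items():
        members = sorted(members, key=lambda r: r[series_dim])
        series[key] = [(r[series_dim], r[value_dim]) for r in members]
    return series


rows = [
    {"Country": "A", "year": 2011, "value": 1.5},
    {"Country": "B", "year": 2010, "value": 2.0},
    {"Country": "A", "year": 2010, "value": 1.0},
    {"Country": "B", "year": 2011, "value": 2.5},
]
series = build_series(rows, "year", "value", group_by="Country")
```

Each entry of the result is one polyline: the points of country "A" and the points of country "B", each ordered by year.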

Filter

Filters examine attribute values of data items and perform attribute filtering. VisFlow has three types of filters.

Value Filter

Passes through only the items with an attribute value that is equal to a given value, or contains the given value as a substring. The given value can be specified as a regular expression. Value filter supports multiple filtering dimensions, in which case the same filtering value is used on every filtering dimension selected.

The filtering value is set using constants, which can either be entered directly in the Value Filter's input box, or passed in via the constant port.

Range Filter

Passes through only the items with an attribute value within a given value range. If the range values are strings, the comparison uses lexicographical order. Range filter supports multiple filtering dimensions, in which case the same filtering range is used on every filtering dimension selected.

The filtering range is set using constants, which can either be entered directly in the Range Filter's input boxes, or passed in via the constant ports.

Sampler

Samples the input data items by their attribute values on a filtering dimension and produces a smaller subset.

Sampler has three sampling conditions: minimum, maximum, and random. The minimum/maximum conditions work like a band-limiting filter that keeps the smallest/largest values. The random condition randomly selects items from the input.

Sampler has two sampling modes, count and percentage. A "number to pass" value is specified, say X. With the "count" mode, the sampler outputs exactly X items. For example, when X = 5 and the sampling condition is minimum, then the sampler outputs the 5 smallest items. With the "percentage" mode, the sampler outputs X percent of the input items. For example, when X = 5 and the sampling condition is random sampling, then the sampler outputs 5% of the input items.

Sampler can be set to consider only unique attribute values. If X = 5, the sampling condition is random sampling, and the mode is count, then the sampler outputs 5 items with distinct attribute values on the filtering dimension. This is useful when you are particularly interested in the distinct values of some dimension.
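
The Range Filter and Sampler behavior described above can be approximated in a few lines. This is a sketch under several assumptions that the text does not settle: range bounds are treated as inclusive, percentages are rounded down, percentages apply to the pool after the unique step, and the unique option keeps the first item seen for each distinct value. All function and parameter names are hypothetical.

```python
import random


def range_filter(rows, dim, low, high):
    """Keep items whose value on `dim` lies in [low, high] (inclusive: assumption)."""
    return [r for r in rows if low <= r[dim] <= high]


def sample(rows, dim, condition, x, mode="count", unique=False, seed=None):
    """condition: "minimum", "maximum" or "random"; mode: "count" or "percentage"."""
    pool = rows
    if unique:
        # Assumption: keep the first item seen for each distinct value.
        seen, pool = set(), []
        for r in rows:
            if r[dim] not in seen:
                seen.add(r[dim])
                pool.append(r)
    # Assumption: percentages are rounded down.
    count = x if mode == "count" else int(len(pool) * x / 100)
    count = min(count, len(pool))
    if condition == "random":
        return random.Random(seed).sample(pool, count)
    ordered = sorted(pool, key=lambda r: r[dim], reverse=(condition == "maximum"))
    return ordered[:count]


cars = [{"mpg": v} for v in [30, 10, 20, 10, 40]]
in_range = range_filter(cars, "mpg", 15, 30)
two_smallest = sample(cars, "mpg", "minimum", 2)
two_smallest_unique = sample(cars, "mpg", "minimum", 2, unique=True)
top_40_percent = sample(cars, "mpg", "maximum", 40, mode="percentage")
```

With the unique option the two smallest items have values 10 and 20; without it, both items with value 10 are returned.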

Value Generator

Value generators produce constants that can be used for filtering.

Value Maker

Produces constants entered manually by the user.

Value Extractor

Extracts constants from the attribute values of its data items. Value Extractor can be used to relate two heterogeneous datasets. For example, we can extract the ids of data items from one dataset and then use those ids to filter the data items from another dataset.
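
The idea of relating two datasets through extracted constants can be sketched by pairing a value extractor with a value filter. The function names, the mode parameter, and the sample data are hypothetical. The sketch also assumes that with several filtering dimensions an item passes if any dimension matches, which the text does not specify.

```python
import re


def extract_values(rows, dim):
    """Value Extractor: collect the distinct values of one dimension as constants."""
    return sorted({r[dim] for r in rows})


def value_filter(rows, dims, constants, mode="equal"):
    """Value Filter: mode is "equal", "substring" or "regex"."""
    def matches(cell, const):
        if mode == "equal":
            return cell == const
        if mode == "substring":
            return str(const) in str(cell)
        return re.search(str(const), str(cell)) is not None  # mode == "regex"

    # Assumption: an item passes if any selected dimension matches any constant.
    return [r for r in rows if any(matches(r[d], c) for d in dims for c in constants)]


# Dataset 1: items selected in some plot. Dataset 2: a different table.
selected_cars = [{"id": 2}, {"id": 3}]
owners = [
    {"car_id": 1, "owner": "ann"},
    {"car_id": 2, "owner": "bob"},
    {"car_id": 3, "owner": "cy"},
]
ids = extract_values(selected_cars, "id")
related = value_filter(owners, ["car_id"], ids)
```

Here the constants extracted from the first dataset travel to the filter in the same way constants would travel through a constant port.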

Set Operation

Set operations allow subset manipulation. VisFlow has three set operations:

Union

Unifies the input subsets.

The Union node accepts multiple inputs. The rendering properties of the inputs are combined. Upon a conflict, a later-connected input overrides the rendering properties of earlier-connected inputs.

Intersect

Outputs the intersection of the input subsets.

The Intersect node accepts multiple inputs. The rendering properties of the inputs are combined for the items in the intersection. Upon a conflict, a later-connected input overrides the rendering properties of earlier-connected inputs.

Minus

Subtracts the input subset(s) Y from the input subset X.

There may be only one subset to subtract from, i.e. X, while there can be multiple subsets to subtract, Y1, Y2, ..., Yn, in which case the output subset is X - Y1 - Y2 - ... - Yn.

The rendering properties associated with X will be used for the output rendering properties.
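
The three set operations and their rules for rendering properties can be sketched with dictionaries that map an item id to its rendering properties. This representation and the function names are assumptions made for illustration; the override rule follows the description above, where later-connected inputs win upon a conflict.

```python
def union(*inputs):
    """Combine subsets; later inputs override earlier ones per property."""
    out = {}
    for subset in inputs:  # in connection order
        for item, props in subset.items():
            out[item] = {**out.get(item, {}), **props}
    return out


def intersect(*inputs):
    """Keep items present in every input, with combined rendering properties."""
    common = set(inputs[0]).intersection(*inputs[1:])
    merged = union(*inputs)
    return {item: merged[item] for item in common}


def minus(x, *ys):
    """X - Y1 - ... - Yn; the output keeps the rendering properties of X."""
    removed = set().union(*ys)
    return {item: dict(props) for item, props in x.items() if item not in removed}


everything = {0: {}, 1: {"color": "blue", "size": 3}, 2: {}}
selected = {1: {"color": "red"}}

highlighted = union(everything, selected)  # selection connected last
rest = minus(everything, selected)
both = intersect(everything, selected)
```

In highlighted, item 1 ends up red while keeping its size of 3, because only the conflicting property is overridden. Swapping the connection order would leave it blue.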

Property Binder

Property binders let you assign rendering properties to data items.

Property Editor

Directly sets the rendering properties of its input.

Property Mapping

Encodes the attribute values of its input with a user-selected rendering property type. For example, you may encode the mpg of cars as the rendering property color.
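
As an illustration of such an encoding, the sketch below maps a numeric attribute linearly onto a color ramp. The linear scale, the white-to-red ramp, the hex output format, and the function names are all assumptions; the scales VisFlow actually offers are not described here.

```python
def lerp_color(low, high, t):
    """Interpolate between two RGB triples for t in [0, 1]."""
    return tuple(round(a + (b - a) * t) for a, b in zip(low, high))


def map_to_color(rows, dim, low=(255, 255, 255), high=(255, 0, 0)):
    """Return one hex color per row, scaled between the min and max of `dim`."""
    values = [r[dim] for r in rows]
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1  # avoid division by zero when all values are equal
    return [
        "#%02x%02x%02x" % lerp_color(low, high, (v - lo) / span)
        for v in values
    ]


cars = [{"mpg": 10}, {"mpg": 20}, {"mpg": 30}]
colors = map_to_color(cars, "mpg")
```

The car with the lowest mpg gets the low end of the ramp and the car with the highest mpg gets the high end, so the color now carries the attribute value into every downstream visualization.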

Shortcuts


Diagram

Key Function
Ctrl + E New Diagram
Ctrl + S Save Diagram
Ctrl + L Load Diagram

Editing

Key Function
A Call the node creation panel
L Toggle the node label of the selected node(s)
M Minimize/expand the selected node(s)
P Toggle the visibility of the option panel
V Toggle the visualization mode of the selected node(s)
Shift Add node(s) to the current node selection
Delete / Ctrl + X Delete the selected node(s) / edge

Visualization

Key Function
Ctrl + A Select all items in the visualization
Ctrl + Shift + A De-select all items in the visualization
Shift Add items to the current item selection
N (network only) Turn on/off the network navigation