Copyright © 2009 James Taylor. Visit the original article at First Look – Lyza.
I got a chance to see Lyzasoft’s new product in action recently. Lyzasoft aims to provide a desktop product that lets business people do analysis that can scale up seamlessly, unlike (say) spreadsheet-based analysis. The product is built around a column store.
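Lyza’s internals aren’t something I can show, but the column-store idea itself is easy to illustrate. Here is a toy sketch (my own, not Lyza’s code, with made-up data) of why a column-oriented layout suits analytic queries:

```python
# Toy illustration of row-oriented vs. column-oriented storage.
# Conceptual only - not how Lyza actually stores data.

# Row store: each record is kept together; scanning one column touches every field.
rows = [
    {"customer": "A", "region": "East", "amount": 120.0},
    {"customer": "B", "region": "West", "amount": 75.5},
    {"customer": "C", "region": "East", "amount": 210.0},
]

# Column store: each column is a contiguous array, so an aggregate like
# sum(amount) only has to read the one column it needs.
columns = {
    "customer": ["A", "B", "C"],
    "region":   ["East", "West", "East"],
    "amount":   [120.0, 75.5, 210.0],
}

total = sum(columns["amount"])  # touches only the "amount" column
east = [a for a, r in zip(columns["amount"], columns["region"]) if r == "East"]
print(total, sum(east))
```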
Workbooks are the core metaphor and they are used to assemble flows. Data connections are the first step in these flows and can be created from Access, text files, Oracle databases, etc. Users can drag and drop various elements – a stack of linked queries, perhaps. Data is then sucked into the column store. A nice drag-and-drop interface allows joins, appends and so on to be added. Each node in the workbook flow consists of inputs, instructions and outputs, and it is easy for users to chain these together. For each node the user sees the input data at the top for the sources being manipulated. Simple operations and drag and drop can then be used to take action. For instance, similar columns can be dragged together so that the tool knows they can be stacked. Users can also set default values, define formatting and more as they work on the data. It is easy to add filters and other transformations, and Excel-like formula building in column definitions allows things like “previous purchase” to be defined as a column. Nodes include summarization (non-destructive), filtering (destructive), calculations (additive), joins (could do anything), sourcing decisions and more.
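To make the node chain concrete: the input–instructions–output flow is roughly what an analyst would otherwise script by hand as a dataframe pipeline. Here is a hedged sketch using pandas as a stand-in, with invented column names – stack similar sources, join, filter, add a calculated column, summarize:

```python
import pandas as pd

# Hypothetical sources standing in for Lyza data connections
# (Access, text files, Oracle, etc.); all column names are invented.
orders_q1 = pd.DataFrame({"cust_id": [1, 2], "amount": [100.0, 40.0]})
orders_q2 = pd.DataFrame({"cust_id": [1, 3], "amount": [60.0, 25.0]})
customers = pd.DataFrame({"cust_id": [1, 2, 3], "region": ["East", "West", "East"]})

# "Stacking" similar columns: append one source beneath another.
orders = pd.concat([orders_q1, orders_q2], ignore_index=True)

# Join node: bring in customer attributes.
flow = orders.merge(customers, on="cust_id", how="left")

# Filter node (destructive: rows disappear downstream).
flow = flow[flow["amount"] > 30]

# Calculation node: an Excel-like formula defining a new column.
flow["amount_with_tax"] = flow["amount"] * 1.08

# Summarization node (non-destructive: the detail still exists upstream).
summary = flow.groupby("region", as_index=False)["amount_with_tax"].sum()
print(summary)
```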
The tool is designed to handle large data sets and to flag issues (like missing data) automatically. It takes seconds to import millions of rows and it is very quick to display results, filter down by values, summarize and so on. Everything is designed to make it possible for non-techies to work effectively. The join node, for instance, gives nice visual clues using a Venn diagram and handles conversions of data elements so the join can be defined. The speed allows constant visual feedback, so users can see the results of an action, decide if that is what they wanted or expected, and either undo or continue. They do not have to worry about the technicalities – is this an inner or an outer join, for instance?
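The inner/outer distinction the tool hides is the standard one. A quick pandas illustration (my own example, not Lyza output) of what the user is being spared:

```python
import pandas as pd

left = pd.DataFrame({"cust_id": [1, 2], "name": ["Ann", "Bob"]})
right = pd.DataFrame({"cust_id": [2, 3], "amount": [50.0, 75.0]})

# Inner join: only rows with a match on both sides survive (just cust_id 2).
inner = left.merge(right, on="cust_id", how="inner")

# Outer join: every row from both sides is kept, with NaN where there is no match.
outer = left.merge(right, on="cust_id", how="outer")

print(inner)
print(outer)
```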
Users can build nice graphs and generate trace documents from an XML specification of the environment. Everything is traceable and visible. If a user wants to build on someone else’s work and has access to their analysis, they can see the trace all the way back. This means any shared analysis is understandable, and the traceability is one of the product’s best features. In addition, this XML-based specification can be moved to a server-based environment, which allows companies to bridge ad-hoc, end-user-centric analysis to IT. No re-do, no spreadsheet brittleness. This is very nice, as it allows people to answer the question “What’s in that number?” – the derivation of summary information is key and is made visible by the product. The tool also allows “re-parenting”, so that a temporary source (say a file dumped out of a database) can be replaced (say with a live connection to the data). This is a powerful feature for creating a seamless promotion from end-user analysis to centrally managed analysis.
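Re-parenting is essentially late binding of the data source: the flow refers to a source by name, and the binding behind that name can be swapped from a flat-file dump to a live connection without touching the downstream nodes. A hypothetical sketch of the idea – the names, files and registry are mine, not Lyza’s:

```python
import sqlite3
import pandas as pd

# The flow refers to a source by name; a registry decides what backs it.
sources = {}

def register(name, loader):
    sources[name] = loader

def load(name):
    return sources[name]()

# Initially the analyst works from a file dumped out of a database...
register("orders", lambda: pd.read_csv("orders_dump.csv"))

# ...and later the same source name is "re-parented" onto a live connection.
def from_database():
    with sqlite3.connect("warehouse.db") as conn:
        return pd.read_sql("SELECT * FROM orders", conn)

register("orders", from_database)

# Downstream nodes keep calling load("orders") and never need to change.
```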
The enterprise version has a web services API for flows and access to enterprise databases; a light version comes without enterprise connectivity or APIs. In addition there is a commons version for brokered peer-to-peer sharing of analysis. Servers allow analysts to create pub/sub relationships with each other to share analysis, and these can be monitored. The intent is to make it possible to manage analysts, replace people who quit, update schedules and so on. Cut and paste is replaced with links through a shared commons. They are also adding a web front end so that non-Lyza users can consume and comment on reports and analysis.
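To give a feel for the pub/sub idea (purely conceptual – the server product is not documented at this level), a shared commons behaves roughly like a registry where publishing an analysis notifies whoever has subscribed to it:

```python
# Toy sketch of a "commons" with pub/sub sharing of analyses.
from collections import defaultdict

class Commons:
    def __init__(self):
        self.subscribers = defaultdict(list)   # analysis name -> callbacks
        self.published = {}                    # analysis name -> latest artifact

    def subscribe(self, analysis_name, callback):
        self.subscribers[analysis_name].append(callback)

    def publish(self, analysis_name, artifact):
        self.published[analysis_name] = artifact
        for notify in self.subscribers[analysis_name]:
            notify(artifact)                   # downstream analysts pick up the change

commons = Commons()
commons.subscribe("quarterly_summary", lambda a: print("updated:", a))
commons.publish("quarterly_summary", {"rows": 1200, "owner": "analyst_a"})
```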
They are adding some stats and analytics, e.g. stepwise regression, but there is clearly more to come.
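Stepwise regression itself is a standard technique: start from an empty model and repeatedly add the predictor that most improves the fit until nothing helps. A rough forward-selection sketch in Python (scikit-learn used for the fits; nothing here comes from Lyza):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def forward_stepwise(X, y, tol=1e-3):
    """Greedy forward selection: add the column that most improves R^2."""
    remaining, chosen, best_r2 = list(range(X.shape[1])), [], 0.0
    while remaining:
        scores = []
        for j in remaining:
            cols = chosen + [j]
            r2 = LinearRegression().fit(X[:, cols], y).score(X[:, cols], y)
            scores.append((r2, j))
        r2, j = max(scores)
        if r2 - best_r2 < tol:          # stop when the gain is negligible
            break
        chosen.append(j)
        remaining.remove(j)
        best_r2 = r2
    return chosen, best_r2

# Tiny synthetic example: y depends on columns 0 and 2 only.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 3 * X[:, 0] - 2 * X[:, 2] + rng.normal(scale=0.1, size=200)
print(forward_stepwise(X, y))
```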
I really liked both the ease of use and the way in which end users are brought into the tool without being condemned to a marginal existence – the same analysis can be created locally and then shared effectively as part of a real IT system. The traceability and the declarative nature of the tool were both great.