Parallel Coordinates and Application to Analysis and Control

Summary of First Year Conclusions

A. In principle, parallel coordinates can be used to effect a quantitative visual representation of a multivariate data set; in other words, they allow the user to see all the data in a single graphical view. However, this way of viewing data presents an unfamiliar visual paradigm, and one that is not necessarily independent of graphical parameters that are artifacts of the parallel coordinate representation and are unrelated to the data per se. In effect, these parameters introduce a form of "visual distortion". While distortion is not intrinsically a bad thing, it must be used with care; otherwise it will affect the way the user evaluates the real data. From a practical point of view, parallel coordinates can be used as an adjunct to, but not a replacement for, more traditional means of visualizing power plant operations data.
 
B. Experimentation with model plant data sets generated by the DHR model indicates that data scaling may be a fundamental limitation on the use of parallel coordinates for electric power plant data visualization. This model involves about 145 input variables and about 160 output variables, all of which represent thermodynamic states associated with the plant's components (e.g., turbines, reheaters, and boilers). These thermodynamic variables cover a very wide range of numerical values; for example, temperature values cover a range of 2 to 3 orders of magnitude depending on whether we are concerned with superheated steam at a turbine inlet or liquid water in a pump. The scale problem is somewhat reduced, but not eliminated, if intensive, as opposed to extensive, thermodynamic properties are used. While our graphical tools can be used to effect rescaling, the bottom line is that the ability to employ parallel coordinates to look for "patterns" in the data, or to visually identify correlated data anomalies, is greatly impaired by the scaling problem.
 
The scaling problem, as discussed above, is of primary concern if one attempts to use the parallel coordinates representation as a tool for global plant data monitoring. That is, the scaling issue is most apparent when physical data representing many different plant variables are viewed simultaneously. If selected variables (for example, only a turbine's collection of inlet temperatures) are isolated for parallel coordinate visualization, then the scaling problem is greatly reduced. In an MS thesis at the University of Florida, F. Bundy studied real data from a co-generation plant as an exercise in parallel coordinates visualization. Because the physical data sets chosen were representative of plant components (as opposed to the plant as a whole), Bundy's results were not severely affected by incommensurate scaling, and the parallel coordinate visualization resulted in some interesting visual patterns.
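
The rescaling mentioned in B can be illustrated with a minimal sketch. It assumes per-variable min-max rescaling, which is only one possible choice and is not necessarily what the project's graphical tools do; the variable names and values are hypothetical. Each variable is mapped to the interval [0, 1] independently, so that variables of very different magnitude share a common axis height. Note that this rescaling is itself a graphical parameter of the kind discussed in A.

```python
def minmax_rescale(records):
    """Rescale each variable to [0, 1] independently of the others.

    records: list of dicts, each mapping variable name -> numeric value.
    A variable with no spread is placed at mid-axis (0.5).
    """
    names = list(records[0].keys())
    lo = {n: min(r[n] for r in records) for n in names}
    hi = {n: max(r[n] for r in records) for n in names}
    scaled = []
    for r in records:
        row = {}
        for n in names:
            span = hi[n] - lo[n]
            row[n] = 0.5 if span == 0 else (r[n] - lo[n]) / span
        scaled.append(row)
    return scaled


# Hypothetical plant states (illustrative values, not DHR model output).
samples = [
    {"turbine_inlet_T_K": 810.0, "pump_T_K": 305.0, "boiler_P_kPa": 16000.0},
    {"turbine_inlet_T_K": 790.0, "pump_T_K": 310.0, "boiler_P_kPa": 15500.0},
    {"turbine_inlet_T_K": 800.0, "pump_T_K": 300.0, "boiler_P_kPa": 16500.0},
]
scaled = minmax_rescale(samples)
```

Each rescaled record can then be drawn as one polyline across the parallel axes. The rescaled values no longer carry physical units, so any pattern seen this way should be checked against the original data.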
 
C. While the scaling problem is very serious for power plant data sets, it is much less of an issue (or perhaps not an issue at all) when dealing with data associated with power distribution. Data associated with line voltages, load factors, etc. can usually be scaled in a way that makes these data more amenable to parallel coordinate representation.
 
D. A useful way to employ parallel coordinates for power plant data is to compare real and ideal thermodynamic processes for particular plant components. Used this way, parallel coordinate visualization becomes a powerful tool that can help identify thermodynamic inefficiencies associated with particular plant components. It is in this context that parallel coordinate visualization tools might prove useful as part of a control system.
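
One way to make the real-versus-ideal comparison in D concrete is sketched here. The report does not say which measures were used, so both functions are assumptions chosen for illustration: a per-variable relative deviation (the gap between the real and ideal polylines on each axis) and the standard turbine isentropic efficiency, the ratio of actual to ideal enthalpy drop. All names and numbers are hypothetical.

```python
def relative_deviations(real, ideal):
    """Per-variable relative deviation of the real state from the ideal one."""
    return {name: (real[name] - ideal[name]) / ideal[name] for name in ideal}


def isentropic_efficiency(h_in, h_out_real, h_out_ideal):
    """Turbine isentropic efficiency: actual enthalpy drop over ideal drop."""
    return (h_in - h_out_real) / (h_in - h_out_ideal)


# Hypothetical turbine outlet states (illustrative values only).
ideal = {"h_out_kJ_per_kg": 2400.0, "T_out_K": 320.0}
real = {"h_out_kJ_per_kg": 2600.0, "T_out_K": 336.0}

dev = relative_deviations(real, ideal)
eta = isentropic_efficiency(3400.0, real["h_out_kJ_per_kg"], ideal["h_out_kJ_per_kg"])
worst_axis = max(dev, key=lambda name: abs(dev[name]))
```

Here `worst_axis` names the variable whose real value departs most from the ideal, which is the axis where the two polylines would separate most visibly after rescaling.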
 
E. One experiment of interest was to see whether parallel coordinates would be useful in identifying plant component data changes associated with anomalous values, such as an out-of-range or dangerously high pressure. In attempting to use the DHR model for this type of trial, it was found that the model, which is based on a neural network learning code, responded in unphysical ways to anomalous input variables. For this portion of the work, we chose to introduce a different model, namely the PSLAM model.
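
The kind of out-of-range check described in E can be sketched as follows. The report does not describe how anomalous values were defined or detected, so the fixed operating limits, the variable names, and the values below are all hypothetical.

```python
def out_of_range(sample, limits):
    """Return the names of variables whose values fall outside their limits.

    limits: dict mapping variable name -> (low, high) operating bounds.
    """
    flagged = []
    for name, (low, high) in limits.items():
        value = sample[name]
        if value < low or value > high:
            flagged.append(name)
    return flagged


# Hypothetical operating limits and plant states (illustrative values only).
limits = {
    "boiler_P_kPa": (10000.0, 17000.0),
    "turbine_inlet_T_K": (700.0, 850.0),
}
normal = {"boiler_P_kPa": 16000.0, "turbine_inlet_T_K": 810.0}
anomalous = {"boiler_P_kPa": 19500.0, "turbine_inlet_T_K": 815.0}

flags_normal = out_of_range(normal, limits)
flags_anomalous = out_of_range(anomalous, limits)
```

In a parallel coordinate view, a flagged sample could be drawn as a highlighted polyline so that the other variables that change along with the anomalous one are visible at the same time.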
 
F. In the course of modifying the SCENE visualization tools for parallel coordinate visualization, provision was made for software "hooks" so that a "closed loop" could be effected that would allow construction of a plant monitor and control simulation. Building such a simulation required access to the plant simulation model source code; proprietary components in the DHR model precluded the needed code access. PSLAM was used instead.
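
The "closed loop" idea in F can be illustrated with a toy sketch. It does not reflect the actual SCENE hooks or the PSLAM interface, neither of which is described in this summary; the class, the one-variable plant, and the proportional controller are all hypothetical stand-ins. On each step the registered hooks observe the plant state (the role a visualization would play), the controller issues a command, and the plant model advances.

```python
class MonitorLoop:
    """Toy monitor-and-control loop with observer hooks (hypothetical design)."""

    def __init__(self, step_plant, controller):
        self.step_plant = step_plant    # (state, command) -> next state
        self.controller = controller    # (state, setpoint) -> command
        self.hooks = []                 # callables invoked with each state

    def add_hook(self, fn):
        self.hooks.append(fn)

    def run(self, state, setpoint, n_steps):
        for _ in range(n_steps):
            for hook in self.hooks:     # e.g. redraw a visualization
                hook(state)
            command = self.controller(state, setpoint)
            state = self.step_plant(state, command)
        return state


def toy_plant(state, command):
    # Stand-in plant model: the pressure shifts by the commanded amount.
    return {"pressure_kPa": state["pressure_kPa"] + command}


def proportional_controller(state, setpoint, gain=0.5):
    # Command proportional to the gap between setpoint and current pressure.
    return gain * (setpoint - state["pressure_kPa"])


history = []
loop = MonitorLoop(toy_plant, proportional_controller)
loop.add_hook(lambda s: history.append(s["pressure_kPa"]))
final = loop.run({"pressure_kPa": 100.0}, setpoint=120.0, n_steps=3)
```

Keeping the plant model behind a plain function boundary is what makes the need for source access apparent: the loop can only be closed if the simulation can be stepped and commanded from outside.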
 
G. Although the PSLAM work was collateral to the primary project tasks, it was worthwhile to pursue because there is a need in the community for a "public domain" power plant model that is reasonably realistic in its simulation capabilities without being too specialized (as, for example, an operator training model would be). The following was done:
  • PSLAM code (FORTRAN) was upgraded to be consistent with modern FORTRAN.
  • PSLAM was ported (from DOS) to Unix and PowerMac platforms so that it runs on all major platforms.
  • A PSLAM web page was created to facilitate community access to the code, together with installation and porting instructions.
A more sophisticated model called CAMEL was also investigated, and work was begun to upgrade and port it, but this work was temporarily suspended. CAMEL retains the simulation features of PSLAM but adds expert system and data inference capabilities.

