Abstract: The homogeneity of groups of 16-dimensional wind direction roses (obtained by hierarchical clustering in a previous report) is examined using Andrews’ Curves. Principal Component Analysis (PCA) is employed to reduce dimensionality and to provide the ordering of the variables used to compute Andrews’ Curves. Our results suggest that Andrews’ Curves greatly facilitate the visualization of homogeneity and reveal information that helps improve the arrangement of the clusters. A combined analysis employing Andrews’ Curves and the Calinski and Harabasz approach (a method for determining the optimal number of groups) helps to assess the strength of the group structure of the data and to detect anomalies such as misclassified objects or atypical values. Furthermore, it shows that the 24 original seasonal hourly roses (representing the “day”) are better represented by 6 groups than by the 5 proposed in the previous report. The new group arrangement is consistent with the dendrogram cut at a different cut-off distance. As a result, the wind occurrences are now represented by a more detailed and smoother pattern: northern winds decrease between midday and twilight, while eastern winds become more important towards the evening. The proposed methodology is a candidate for inclusion in an automated system.
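The two tools named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses the textbook definition of Andrews' function and the standard Calinski and Harabasz index (between-group dispersion over within-group dispersion, each divided by its degrees of freedom). The function names `andrews_curve` and `calinski_harabasz` are hypothetical, and each observation is assumed to be already expressed as PCA scores ordered by decreasing explained variance, so that the leading components receive the lowest-frequency terms.

```python
import math


def andrews_curve(x, t):
    """Evaluate Andrews' function for one observation x at angle t (radians).

    f_x(t) = x[0]/sqrt(2) + x[1]*sin(t) + x[2]*cos(t)
             + x[3]*sin(2t) + x[4]*cos(2t) + ...
    Assumption: x holds PCA scores, most important component first.
    """
    value = x[0] / math.sqrt(2.0)
    for i, coeff in enumerate(x[1:], start=1):
        freq = (i + 1) // 2  # frequencies run 1, 1, 2, 2, 3, 3, ...
        if i % 2 == 1:
            value += coeff * math.sin(freq * t)
        else:
            value += coeff * math.cos(freq * t)
    return value


def calinski_harabasz(points, labels):
    """Calinski and Harabasz index for a partition of points into groups.

    CH = [B / (k - 1)] / [W / (n - k)], where B and W are the between-group
    and within-group sums of squared Euclidean distances, n is the number
    of points and k the number of groups. Larger values indicate a
    stronger group structure.
    """
    n = len(points)
    dim = len(points[0])
    groups = {}
    for point, label in zip(points, labels):
        groups.setdefault(label, []).append(point)
    k = len(groups)
    if k < 2 or k >= n:
        raise ValueError("need 2 <= number of groups < number of points")

    overall = [sum(p[j] for p in points) / n for j in range(dim)]
    between = 0.0
    within = 0.0
    for members in groups.values():
        size = len(members)
        centroid = [sum(p[j] for p in members) / size for j in range(dim)]
        between += size * sum((c - o) ** 2 for c, o in zip(centroid, overall))
        for p in members:
            within += sum((a - c) ** 2 for a, c in zip(p, centroid))

    if within == 0.0:
        return math.inf  # every group collapses onto its centroid
    return (between / (k - 1)) / (within / (n - k))
```

Under these assumptions, one curve per rose would be obtained by evaluating `andrews_curve` on a grid of `t` values in [-π, π]. Curves belonging to a homogeneous group should stay close to one another, while a misclassified or atypical rose would appear as a curve that departs from its group. Computing `calinski_harabasz` for partitions with different numbers of groups (for example 5 and 6) gives a numerical counterpart to that visual check.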