Analysis

The Analysis module makes it easy to analyze time series data. It provides a variety of analyzers for inspecting data properties, as well as a report API that presents the aggregated results of those analyzers.

1. Analyzer

Currently supported analyzers:

  • Summary : Statistical indicators. Currently supported: count, mean, standard deviation, minimum, 25% quantile, 50% quantile (median), 75% quantile, maximum, missing percentage, and stationarity p-value.

  • Max : Compute maximum values of given columns.

  • FFT : Frequency domain analysis of signal based on fast Fourier transform.

  • STFT : Time-frequency analysis of signal based on short-time Fourier transform.

  • CWT : Time-frequency analysis of signal based on continuous wavelet transform.
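To make the difference between frequency-domain and time-frequency analysis in the list above more concrete, here is a minimal standard-library sketch. It does not use PaddleTS: the helper `dominant_frequency`, the sampling frequency `fs`, and the synthetic signal are all hypothetical, and a plain O(N²) discrete Fourier transform over non-overlapping Hann-windowed frames stands in for a real STFT, which normally uses overlapping frames and a fast transform. The point is only that a windowed transform can show when a frequency occurs, not just that it occurs.

```python
import cmath
import math

def dominant_frequency(frame, fs):
    """Return the strongest non-DC frequency of one frame (naive DFT, Hann window)."""
    n = len(frame)
    # A Hann window tapers the frame edges to reduce spectral leakage
    windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * t / n))
                for t, x in enumerate(frame)]
    best_k, best_mag = 0, 0.0
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(windowed))
        if abs(coeff) > best_mag:
            best_k, best_mag = k, abs(coeff)
    return best_k * fs / n  # convert bin index to frequency

fs = 32  # hypothetical sampling frequency (samples per second)
# Synthetic signal: 2 Hz for two seconds, then 8 Hz for two seconds
signal = ([math.sin(2 * math.pi * 2 * t / fs) for t in range(2 * fs)]
          + [math.sin(2 * math.pi * 8 * t / fs) for t in range(2 * fs)])

# Split into one-second frames (no overlap, for simplicity)
frame_len = fs
frames = [signal[s:s + frame_len]
          for s in range(0, len(signal) - frame_len + 1, frame_len)]
track = [dominant_frequency(frame, fs) for frame in frames]
print(track)  # dominant frequency per one-second frame
```

A single transform over the whole signal would report both 2 Hz and 8 Hz without saying when each occurs; the per-frame result keeps that timing information, which is what the STFT and CWT analyzers are meant to expose.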

The following code snippet shows how to apply an analyzer to a TSDataset object. We use the UNI_WTH dataset as a sample, which is a univariate dataset containing weather data from 2010 to 2014, where WetBulbCelsius represents the wet bulb temperature.

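from paddlets.datasets.repository import get_dataset
from paddlets.analysis import Summary

# Load the built-in univariate weather dataset
tsdataset = get_dataset('UNI_WTH')

# Create the analyzer and apply it to the dataset
# (named summary_analyzer to avoid shadowing the built-in sum)
summary_analyzer = Summary()
summary_analyzer(tsdataset)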

#         WetBulbCelsius
#missing        0.000000
#count      35064.000000
#mean           1.026081
#std            6.898354
#min          -26.400000
#25%           -3.800000
#50%            0.600000
#75%            6.600000
#max           16.300000
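To show what the rows of this output mean, the following sketch computes the same kinds of indicators for a small made-up series using only the Python standard library. It is not how PaddleTS computes them: `summarize` and the sample values are hypothetical, missing values are assumed to be encoded as NaN, `missing` is reported as a fraction, and the quantile interpolation rule may differ from the one PaddleTS uses.

```python
import math
import statistics

def summarize(values):
    """Return summary indicators for a list of floats; NaN marks a missing value."""
    present = [v for v in values if not math.isnan(v)]
    # "inclusive" treats the data as the full population when interpolating
    q25, q50, q75 = statistics.quantiles(present, n=4, method="inclusive")
    return {
        "missing": 1 - len(present) / len(values),  # fraction of missing values
        "count": len(present),
        "mean": statistics.fmean(present),
        "std": statistics.stdev(present),           # sample standard deviation
        "min": min(present),
        "25%": q25,
        "50%": q50,
        "75%": q75,
        "max": max(present),
    }

# Made-up series with one missing observation
sample = [1.0, 2.0, float("nan"), 4.0, 5.0, 8.0]
stats = summarize(sample)
for name, value in stats.items():
    print(f"{name:<8}{value:>10.4f}")
```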

Note that the base analyzers, currently Summary and Max, can also be invoked directly on a TSDataset object:

from paddlets.datasets.repository import get_dataset
tsdataset = get_dataset('UNI_WTH')
tsdataset.summary()

2. Analysis Report

Analysis Report is designed to show aggregated analysis results in the form of a report. Three ways to generate an analysis report are demonstrated below:

2.1 Default Analysis Report

from paddlets.analysis import AnalysisReport
from paddlets.datasets.repository import get_dataset
tsdataset = get_dataset('UNI_WTH')
report = AnalysisReport(tsdataset)
# export a file named "analysis_report.docx" to current path by default
report.export_docx_report()

2.2 Customized Analyzers Report with Default Config

from paddlets.analysis import AnalysisReport
from paddlets.datasets.repository import get_dataset
tsdataset = get_dataset('UNI_WTH')
report = AnalysisReport(tsdataset, ["summary","fft"])
# export a file named "analysis_report.docx" to current path by default
report.export_docx_report()

2.3 Customized Analyzers Report with Customized Config

from paddlets.analysis import AnalysisReport
from paddlets.datasets.repository import get_dataset
tsdataset = get_dataset('UNI_WTH')
customized_config = {
    "fft": {
        "norm": False,
        "fs": 1
    }
}
report = AnalysisReport(tsdataset, ["summary", "fft"], customized_config)
# export a file named "analysis_report.docx" to current path by default
report.export_docx_report()
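The config above sets two FFT parameters, "norm" and "fs". The sketch below illustrates what parameters with these names commonly control in a Fourier analysis; it is an assumption, not a description of the PaddleTS implementation, so the FFT analyzer's API reference remains the authority on their exact meaning. Here `fs` is assumed to be the sampling frequency that scales the frequency axis, and `norm` is assumed to divide amplitudes by the number of samples. The function `amplitude_spectrum` is hypothetical and uses a naive discrete Fourier transform from the standard library in place of a fast one.

```python
import cmath
import math

def amplitude_spectrum(signal, fs=1, norm=False):
    """One-sided amplitude spectrum of a real signal as (frequency, amplitude) pairs.

    fs   : assumed to be the sampling frequency (samples per time unit)
    norm : assumed to mean "divide amplitudes by the number of samples"
    """
    n = len(signal)
    scale = n if norm else 1
    pairs = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(signal))
        pairs.append((k * fs / n, abs(coeff) / scale))
    return pairs

def peak(pairs):
    """Return the (frequency, amplitude) pair with the largest amplitude."""
    return max(pairs, key=lambda pair: pair[1])

# 64 samples containing exactly 5 cycles of a unit sine wave
signal = [math.sin(2 * math.pi * 5 * t / 64) for t in range(64)]

# fs=1: frequencies are expressed in cycles per sample, amplitudes are raw
freq_per_sample, raw_amp = peak(amplitude_spectrum(signal, fs=1, norm=False))
# fs=64: the same peak expressed in cycles per time unit, amplitudes scaled by 1/N
freq_scaled, norm_amp = peak(amplitude_spectrum(signal, fs=64, norm=True))
print(freq_per_sample, freq_scaled)
```

Under these assumptions, changing `fs` only relabels the frequency axis and changing `norm` only rescales the amplitudes; the location of the peak in the spectrum is the same in both calls.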