Line data Source code
1 : !> \file mo_standard_score.f90
2 : !> \brief \copybrief mo_standard_score
3 : !> \details \copydetails mo_standard_score
4 :
5 : !> \brief Routines for calculating the normalization (anomaly)/standard score/z score and the
6 : !! deseasonalized (standard score on monthly basis) values of a time series.
7 : !> \details In environmental research often the centralization and standardization are estimated
8 : !! for characterizing the dynamics of a signal.
9 : !> \author Matthias Zink
10 : !> \date May 2015
11 : !> \copyright Copyright 2005-\today, the CHS Developers, Sabine Attinger: All rights reserved.
12 : !! FORCES is released under the LGPLv3+ license \license_note
13 : MODULE mo_standard_score
14 :
15 : USE mo_kind, ONLY : i4, sp, dp
16 :
17 : IMPLICIT NONE
18 :
19 : PUBLIC :: standard_score ! standard score of a population
20 : PUBLIC :: classified_standard_score ! standard score for classes of a population (e.g. classes=months)
21 :
22 : ! ------------------------------------------------------------------
23 :
24 : ! NAME
25 : ! standard_score
26 :
27 : ! PURPOSE
28 : !> \brief Calculates the standard score / normalization (anomaly) / z-score.
29 : !> \details In statistics, the standard score is the (signed) number of standard deviations an observation
30 : !> or datum is above the mean. Thus, a positive standard score indicates a datum above the mean,
31 : !> while a negative standard score indicates a datum below the mean.
32 : !> It is a dimensionless quantity obtained by subtracting the population mean from
33 : !> an individual raw score and then dividing the difference by the population standard deviation.
34 : !> This conversion process is called standardizing or normalizing (however, "normalizing" can
35 : !> refer to many types of ratios).\n
36 : !> Standard scores are also called z-values, z-scores, normal scores, and standardized variables; the use
37 : !> of "Z" is because the normal distribution is also known as the "Z distribution". They are most frequently
38 : !> used to compare a sample to a standard normal deviate, though they can be defined without assumptions of
39 : !> normality (Wikipedia, May 2015).
40 : !>
41 : !> \f[ standard\_score = \frac{x - \mu_x}{\sigma_x} \f]
42 : !> where \f$ \mu_x \f$ is the mean of a population \f$ x \f$ and \f$ \sigma_x \f$ its standard deviation.
43 : !>
44 : !> If an optinal mask is given, the calculations are over those locations that correspond to true values in the mask.
45 : !> data can be single or double precision. The result will have the same numerical precision.
46 :
47 : ! CALLING SEQUENCE
48 : ! out = standard_score(data, mask=mask)
49 :
50 : ! INDENT(IN)
51 : !> \param[in] "real(sp/dp), dimension(:) :: data" data to calculate the standard score for
52 :
53 : ! INDENT(INOUT)
54 : ! None
55 :
56 : ! INDENT(OUT)
57 : ! None
58 :
59 : ! INDENT(IN), OPTIONAL
60 : !> \param[in] "logical, dimension(:),optinal :: mask" indication which cells to use for calculation
61 : !> If present, only those locations in mask having true values in mask are evaluated.
62 :
63 : ! INDENT(INOUT), OPTIONAL
64 : ! None
65 :
66 : ! INDENT(OUT), OPTIONAL
67 : ! None
68 :
69 : ! RETURN
70 : !> \return real(sp/dp) :: standard_score — standard score / normalization (anomaly) / z-score
71 :
72 : ! RESTRICTIONS
73 : ! Input values must be floating points.
74 :
75 : ! EXAMPLE
76 : ! data = (/ 1., 2, 3., -999., 5., 6. /)
77 : ! out = standard_score(data, mask=(data >= 0.))
78 : ! -> see also example in test directory
79 :
80 : ! LITERATURE
81 : !> \note Richard J. Larsen and Morris L. Marx (2000) An Introduction to Mathematical Statistics and Its
82 : !> Applications, Third Edition, ISBN 0-13-922303-7. p. 282.
83 :
84 : ! HISTORY
85 : !> \author Matthias Zink
86 : !> \date May 2015
87 :
88 : INTERFACE standard_score
89 : MODULE PROCEDURE standard_score_sp, standard_score_dp
90 : END INTERFACE standard_score
91 :
92 : ! ------------------------------------------------------------------
93 :
94 : ! NAME
95 : ! classified_standard_score
96 :
97 : ! PURPOSE
98 : !> \brief Calculates the classified standard score (e.g. classes are months).
99 : !> \details In statistics, the standard score is the (signed) number of standard deviations an observation
100 : !> or datum is above the mean. Thus, a positive standard score indicates a datum above the mean,
101 : !> while a negative standard score indicates a datum below the mean.
102 : !> It is a dimensionless quantity obtained by subtracting the population mean from
103 : !> an individual raw score and then dividing the difference by the population standard deviation.
104 : !> This conversion process is called standardizing or normalizing (however, "normalizing" can
105 : !> refer to many types of ratios).\n
106 : !> Standard scores are also called z-values, z-scores, normal scores, and standardized variables; the use
107 : !> of "Z" is because the normal distribution is also known as the "Z distribution". They are most frequently
108 : !> used to compare a sample to a standard normal deviate, though they can be defined without assumptions of
109 : !> normality (Wikipedia, May 2015).\n
110 : !> In this particular case the standard score is calculated for means and standard deviations derived from
111 : !> classes of the time series. Such classes could be for example months. Thus, the output would be a
112 : !> deseasonalized time series.
113 : !>
114 : !> \f[ classified\_standard\_score = \frac{x_i - \mu_{c_{x_i}}}{\sigma_{c_{x_i}}} \f]
115 : !> where \f$ x_i \f$ is an element of class \f$ c_{x_i} \f$. \f$ x \f$ is a population, \f$ \mu_{c_{x_i}} \f$
116 : !> is the mean of all members of a class \f$ c_{x_i} \f$ and \f$ \sigma_{c_{x_i}} \f$ its standard deviation.
117 : !>
118 : !> If an optinal mask is given, the calculations are over those locations that correspond to true values in the mask.
119 : !> data can be single or double precision. The result will have the same numerical precision.
120 :
121 : ! CALLING SEQUENCE
122 : ! out = classified_standard_score(data, mask=mask)
123 :
124 : ! INDENT(IN)
125 : !> \param[in] "integer, dimension(:) :: classes" classes to categorize data (e.g. months)
126 : !> \param[in] "real(sp/dp), dimension(:) :: data" data to calculate the standard score for
127 :
128 :
129 : ! INDENT(INOUT)
130 : ! None
131 :
132 : ! INDENT(OUT)
133 : ! None
134 :
135 : ! INDENT(IN), OPTIONAL
136 : !> \param[in] "logical, dimension(:), optional :: mask" indication which cells to use for calculation
137 : !> If present, only those locations in mask having true values in mask are evaluated.
138 :
139 : ! INDENT(INOUT), OPTIONAL
140 : ! None
141 :
142 : ! INDENT(OUT), OPTIONAL
143 : ! None
144 :
145 : ! RETURN
146 : !> \return real(sp/dp) :: classified_standard_score — classified standard score (e.g. deseasonalized
147 : !> time series)
148 :
149 : ! RESTRICTIONS
150 : ! Input values must be floating points.
151 :
152 : ! EXAMPLE
153 : ! data = (/ 1., 2, 3., -999., 5., 6. /)
154 : ! classes = (/ 1, 1, 1, 2, 2 , 2 /)
155 : ! out = classified_standard_score(data, classes, mask=(data >= 0.))
156 : ! -> see also example in test directory
157 :
158 : ! LITERATURE
159 : ! None
160 :
161 : ! HISTORY
162 : !> \author Matthias Zink
163 : !> \date May 2015
164 :
165 : INTERFACE classified_standard_score
166 : MODULE PROCEDURE classified_standard_score_sp, classified_standard_score_dp
167 : END INTERFACE classified_standard_score
168 : ! ------------------------------------------------------------------
169 :
170 : PRIVATE
171 :
172 : ! ------------------------------------------------------------------
173 :
174 : CONTAINS
175 :
176 : ! ------------------------------------------------------------------
177 :
178 6 : FUNCTION standard_score_sp(data, mask)
179 :
180 : use mo_moment, only : average, stddev
181 :
182 : implicit none
183 :
184 : real(sp), dimension(:), intent(in) :: data ! data arrau input
185 : logical, dimension(:), optional, intent(in) :: mask ! optional input
186 : real(sp), dimension(size(data, dim = 1)) :: standard_score_sp
187 :
188 : ! local
189 2 : logical, dimension(size(data, dim = 1)) :: maske
190 :
191 : ! check if optional mask matches shape of data
192 2 : if (present(mask)) then
193 1 : if (size(mask) .ne. size(data)) stop '***Error: standard_score_sp: size(mask) .ne. size(data)'
194 11 : maske = mask
195 : else
196 10 : maske(:) = .true.
197 : end if
198 :
199 : ! check if enough values (>1) are available
200 20 : if (count(maske) .LE. 2) stop '***Error: standard_score_sp: less than 2 elements avaiable'
201 :
202 20 : standard_score_sp = (data(:) - average(data, mask = maske)) / stddev(data, mask = maske)
203 :
204 2 : END FUNCTION standard_score_sp
205 :
206 :
207 6 : FUNCTION standard_score_dp(data, mask)
208 :
209 2 : use mo_moment, only : average, stddev
210 :
211 : implicit none
212 :
213 : real(dp), dimension(:), intent(in) :: data ! data arrau input
214 : logical, dimension(:), optional, intent(in) :: mask ! optional input
215 : real(dp), dimension(size(data, dim = 1)) :: standard_score_dp
216 :
217 : ! local
218 2 : logical, dimension(size(data, dim = 1)) :: maske
219 :
220 : ! check if optional mask matches shape of data
221 2 : if (present(mask)) then
222 1 : if (size(mask) .ne. size(data)) stop '***Error: standard_score_dp: size(mask) .ne. size(data)'
223 11 : maske = mask
224 : else
225 10 : maske(:) = .true.
226 : end if
227 :
228 : ! check if enough values (>1) are available
229 20 : if (count(maske) .LE. 2) stop '***Error: standard_score_dp: less than 2 elements avaiable'
230 :
231 20 : standard_score_dp = (data(:) - average(data, mask = maske)) / stddev(data, mask = maske)
232 :
233 2 : END FUNCTION standard_score_dp
234 :
235 : ! ------------------------------------------------------------------
236 :
237 6 : FUNCTION classified_standard_score_sp(data, classes, mask)
238 :
239 2 : use mo_moment, only : average, stddev
240 : use mo_orderpack, only : unista
241 :
242 : implicit none
243 :
244 : real(sp), dimension(:), intent(in) :: data ! data array with input
245 : integer, dimension(:), intent(in) :: classes ! array indicateing classes
246 : logical, dimension(:), optional, intent(in) :: mask ! array masking elements of data
247 : real(sp), dimension(size(data, dim = 1)) :: classified_standard_score_sp
248 :
249 : ! local
250 : integer(i4) :: iclass, ielem ! loop variable
251 : integer(i4) :: number_of_classes ! number of unique classes in vector
252 : ! classes
253 2 : integer(i4), dimension(size(data, dim = 1)) :: unique_classes ! vector of uniqe classes
254 2 : real(sp) :: class_mean ! mean of class
255 2 : real(sp) :: class_stddev ! standard deviation of class
256 2 : logical, dimension(size(data, dim = 1)) :: maske ! data mask
257 2 : logical, dimension(size(data, dim = 1)) :: mask_class_maske ! combined mask for current class and
258 : ! maske
259 :
260 : ! check if optional mask matches shape of data
261 2 : if (present(mask)) then
262 1 : if (size(mask) .ne. size(data)) stop '***Error: classified_standard_score_sp: size(mask) .ne. size(data)'
263 11 : maske = mask
264 : else
265 10 : maske(:) = .true.
266 : end if
267 :
268 : ! check if enough values (>1) are available
269 20 : if (count(maske) .LE. 2) stop '***Error: classified_standard_score_sp: less than 2 elements avaiable'
270 :
271 : ! initialization
272 20 : classified_standard_score_sp = 0.0_sp
273 :
274 : ! write classes to new array for getting unique array elements
275 22 : unique_classes = classes
276 2 : call unista(unique_classes, number_of_classes) ! (unique arry elements in the 1:number_of_classes
277 : ! ! indexes of array unique_classes)
278 :
279 : ! loop over classes
280 8 : do iclass = 1, number_of_classes
281 : ! calculate mean and standard deviation for class
282 60 : mask_class_maske = (maske .AND. (classes==unique_classes(iclass)))
283 6 : class_mean = average(data, mask = mask_class_maske)
284 6 : class_stddev = stddev(data, mask = mask_class_maske)
285 : ! loop over array elements
286 62 : do ielem = 1, size(data, dim = 1)
287 54 : if (.NOT. mask_class_maske(ielem)) cycle ! skip masked values and other classes
288 60 : classified_standard_score_sp(ielem) = (data(ielem) - class_mean) / class_stddev
289 : end do
290 : end do
291 :
292 2 : END FUNCTION classified_standard_score_sp
293 :
294 :
295 4 : FUNCTION classified_standard_score_dp(data, classes, mask)
296 :
297 2 : use mo_moment, only : average, stddev
298 : use mo_orderpack, only : unista
299 :
300 : implicit none
301 :
302 : real(dp), dimension(:), intent(in) :: data ! data array with input
303 : integer, dimension(:), intent(in) :: classes ! array indicateing classes
304 : logical, dimension(:), optional, intent(in) :: mask ! array masking elements of data
305 : real(dp), dimension(size(data, dim = 1)) :: classified_standard_score_dp
306 :
307 : ! local
308 : integer(i4) :: iclass, ielem ! loop variable
309 : integer(i4) :: number_of_classes ! number of unique classes in vector classes
310 2 : integer(i4), dimension(size(data, dim = 1)) :: unique_classes ! vector of uniqe classes
311 2 : real(dp) :: class_mean ! mean of class
312 2 : real(dp) :: class_stddev ! standard deviation of class
313 2 : logical, dimension(size(data, dim = 1)) :: maske ! data mask
314 2 : logical, dimension(size(data, dim = 1)) :: mask_class_maske ! combined mask for current class and maske
315 :
316 : ! check if optional mask matches shape of data
317 2 : if (present(mask)) then
318 1 : if (size(mask) .ne. size(data)) stop '***Error: classified_standard_score_dp: size(mask) .ne. size(data)'
319 11 : maske = mask
320 : else
321 10 : maske(:) = .true.
322 : end if
323 :
324 : ! check if enough values (>1) are available
325 20 : if (count(maske) .LE. 2) stop '***Error: classified_standard_score_dp: less than 2 elements avaiable'
326 :
327 : ! initialization
328 20 : classified_standard_score_dp = 0.0_dp
329 :
330 : ! write classes to new array for getting unique array elements
331 22 : unique_classes = classes
332 2 : call unista(unique_classes, number_of_classes) ! (unique arry elements in the 1:number_of_classes
333 : ! ! indexes of array unique_classes)
334 :
335 : ! loop over classes
336 8 : do iclass = 1, number_of_classes
337 : ! calculate mean and standard deviation for class
338 60 : mask_class_maske = (maske .AND. (classes==unique_classes(iclass)))
339 6 : class_mean = average(data, mask = mask_class_maske)
340 6 : class_stddev = stddev(data, mask = mask_class_maske)
341 : ! loop over array elements
342 62 : do ielem = 1, size(data, dim = 1)
343 54 : if (.NOT. mask_class_maske(ielem)) cycle ! skip masked values and other classes
344 60 : classified_standard_score_dp(ielem) = (data(ielem) - class_mean) / class_stddev
345 : end do
346 : end do
347 :
348 2 : END FUNCTION classified_standard_score_dp
349 :
350 : END MODULE mo_standard_score
|