Monitoring sunspots consistently is the most basic step required to study various aspects of solar activity. To achieve this goal, observers must regularly calculate their own correction factor $k$ and keep it stable. Two observing teams in South Korea have recently presented papers claiming that revisions that take the yearly $k$ into account lead to better agreement with the international relative sunspot number $R_i$, and that the yearly $k$ apparently varies with the solar cycle. In this paper, using artificial data sets, we model the sunspot numbers as a superposition of random noise and a slowly varying background function, and investigate whether the variation in the correction factor is coupled with the solar cycle. Regardless of the statistical distribution of the random noise, we find that the correction factor increases as the sunspot number increases, as claimed in the reports mentioned above. The degree of dependence of the correction factor $k$ on the sunspot number is subject to the signal-to-noise ratio. Therefore, we conclude that the apparent dependence of the value of the correction factor $k$ on the phase of the solar cycle is not due to a physical property, but is a statistical property of the data.
1. INTRODUCTION
The sunspot number provides the longest available record of solar activity. Although its short-term variations give some important insight, e.g., regarding solar differential rotation, its long-term behavior and the length of the series may provide many more such insights. This is why the international scientific community has continuously renewed its interest in the sunspot number. Many scientific investigations, including cycle analysis and forecasting, solar North-South asymmetry analysis, coherence analysis of the solar magnetic field, and mid-term studies of solar activity, are based on the Solar Influences Data Analysis Center (SIDC) sunspot data (Pulkkinen et al. 1999, Lockwood 2003, Solanki et al. 2004, Wang 2004, Chang 2007, Usoskin 2008, Petrovay 2010, Ternullo 2010, Kim & Chang 2011).
The SIDC (Berghmans et al. 2005) collects monthly observations from various stations worldwide in order to calculate the International Relative Sunspot Number $R_i$ (Letfus 2000, Clette et al. 2007, Vaquero 2007). The center broadcasts the daily, monthly, and yearly sunspot numbers, together with mid-range predictions (up to 12 months). The Sunspot Index Data Center was founded in 1981 to continue this work when the Zürich Observatory decided to stop computing and publishing the sunspot number, and in that year it began producing a sunspot index called the International Relative Sunspot Number, $R_i$. Continuity and coherence with the former Zürich index were assured through the use of Locarno as a reference station. In 2000, reorganized and expanded after qualifying as a Regional Warning Center of the International Space Environment Service, the Sunspot Index Data Center became the SIDC. The bulletins and reports of the SIDC are all freely available to the scientific community and the general public on the internet.
In 1849, Wolf at the Zürich Observatory proposed the widely used formula

$$R = 10g + f,$$

where $f$ is the number of individual sunspots and $g$ is the number of sunspot groups (Letfus 2000, Clette et al. 2007, Vaquero 2007). The quality factor $k$ was introduced later to allow comparison of results obtained by different observers, under different atmospheric stability, and with different telescopes, giving the formula

$$R = k\,(10g + f).$$

Observers must satisfy several criteria to be included in the SIDC network: dedication (more than 10 observations per month), regularity (no missing months), and consistency. It is the last that is quantified through the quality factor $k$, which is the correction factor between the raw sunspot number of an individual station and the global network average. The correction factor $k$, which typically has a value between 0.4 and 1.7, ensures that results can be compared with each other. The coefficient of the reference station Locarno is fixed at $k = 0.6$. The task of the SIDC consists of collecting the observations from as many stations as possible worldwide, determining the appropriate $k$ factor for each of them, and extracting an overall $R_i$ from all these observations in a statistically sound way.
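As a minimal illustration of the relation $R = k(10g + f)$, the following Python sketch evaluates the relative sunspot number for a single observation. The function name `wolf_number` and the example counts are hypothetical and are not taken from any SIDC software.

```python
def wolf_number(groups: int, spots: int, k: float = 1.0) -> float:
    """Relative sunspot number R = k * (10 * g + f).

    groups : number of sunspot groups (g)
    spots  : number of individual sunspots (f)
    k      : station correction (quality) factor; k = 1 gives the raw number
    """
    if groups < 0 or spots < 0:
        raise ValueError("counts must be non-negative")
    return k * (10 * groups + spots)


# Hypothetical observation: 3 groups containing 14 spots in total,
# scaled with the Locarno reference coefficient k = 0.6.
raw = wolf_number(3, 14)
scaled = wolf_number(3, 14, k=0.6)
print(raw, scaled)
```

The factor of 10 weights each group as heavily as ten individual spots, so the index is not dominated by how finely an observer resolves spots within a group.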
Let us briefly describe the statistical processing of the input data coming from the worldwide observing network (about 90 stations located in about 30 different countries). The $R_i$ processing consists of two main steps. In the first step, the daily reduction coefficients to Locarno are calculated and monthly averaged for every station. For each station, daily values deviating by more than 2σ from the monthly station average are eliminated. The monthly averages are recomputed iteratively until the $k$ values are consistent for all stations. In the second step, using the updated monthly $k$ coefficients, a corrected sunspot number value is computed for each station, and network averages and standard deviations σ are computed for each day. A 1σ elimination criterion is then applied to the newly calculated daily means until the number of retained stations remains unchanged or the final relative standard deviation is lower than 10%. The final result is retained as the daily $R_i$. The final quality control consists essentially of regular comparisons between the sunspot number $R_i$ on the one hand, and an average of about 20 selected good stations (including the Locarno reference station) or the 10.7 cm radio flux on the other.
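The iterative elimination used in the first step can be sketched as a generic sigma-clipping loop. This is only an illustration under stated assumptions, not the SIDC pipeline: the function name `sigma_clip`, the use of the population standard deviation, the iteration cap, and the sample coefficients are all hypothetical choices.

```python
from statistics import mean, pstdev


def sigma_clip(values, n_sigma=2.0, max_iter=50):
    """Iteratively drop values farther than n_sigma standard deviations
    from the mean, until no further value is rejected.

    Assumption: the population standard deviation is used; the source
    does not specify which estimator the SIDC applies.
    """
    kept = list(values)
    for _ in range(max_iter):
        if len(kept) < 2:
            break
        centre = mean(kept)
        spread = pstdev(kept)
        if spread == 0:
            break
        survivors = [v for v in kept if abs(v - centre) <= n_sigma * spread]
        if len(survivors) == len(kept):
            break  # converged: nothing was rejected in this pass
        kept = survivors
    return kept


# Hypothetical daily reduction coefficients for one station and one month,
# with a single discrepant day (1.50).
daily_k = [0.60, 0.62, 0.58, 0.61, 0.59, 0.60, 0.63, 0.57, 0.60, 1.50]
clean = sigma_clip(daily_k, n_sigma=2.0)
print(len(clean), mean(clean))
```

The same loop with `n_sigma=1.0` would mimic the stricter criterion of the second step, although the additional stopping rule on the relative standard deviation is not implemented here.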
Recently, Oh & Chang (2012) reported the results of sunspot observations at the ButterStar Observatory covering 3364 days, from 16 October 2002 to 31 December 2011. By applying the linear least-squares method to the observed sunspot number ($R_B$) and the International Relative Sunspot Number ($R_i$), they found the overall correction factor $k_b$ for the entire observing period to be 0.9519 with a standard deviation of 0.006. In addition, they applied the same procedure to each year from 2002 to 2011. When calculated for each year, the yearly correction factor takes slightly different $k_b$ values from year to year and, furthermore, shows a trend that follows the solar cycle. That is, the yearly correction factor $k_b$ is in general larger during solar maxima and smaller during solar minima. It was therefore suggested that the errors in indexing sunspot numbers could be reduced somewhat further by determining the correction factor year by year. Interestingly, similar conclusions were drawn by Kim et al. (2003), who analyzed the data observed at the Korea Astronomy and Space Science Institute (KASI).
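A correction factor of this kind can be estimated with a one-parameter least-squares fit. The sketch below assumes a straight line through the origin, reference ≈ k × observed; the source does not state whether an intercept was included or which variable was regressed on which, so both choices are assumptions. The function name `correction_factor` and the sample numbers are hypothetical.

```python
def correction_factor(observed, reference):
    """Least-squares slope k of the model reference = k * observed
    (line through the origin): k = sum(x * y) / sum(x * x).

    Assumption: no intercept term is fitted.
    """
    if len(observed) != len(reference):
        raise ValueError("series must have the same length")
    sxy = sum(x * y for x, y in zip(observed, reference))
    sxx = sum(x * x for x in observed)
    if sxx == 0:
        raise ValueError("observed series is identically zero")
    return sxy / sxx


# Hypothetical daily values: station counts and the matching reference index.
station = [12.0, 35.0, 48.0, 77.0, 101.0]
reference = [11.0, 34.0, 45.0, 74.0, 96.0]
print(correction_factor(station, reference))
```

Because days with large sunspot numbers carry the largest products in both sums, they dominate the estimate, which is one reason a fit restricted to a quiet year behaves differently from a fit over the whole period.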
In this paper, we address the question of whether the variation in the correction factor is really coupled with the solar cycle. We also investigate whether this apparent dependency arises because the linear least-squares method has lower statistical confidence when the number of sunspots is small. If the apparent dependency is indeed an artifact of the statistical treatment, researchers should refrain from correcting the observed sunspot numbers on a yearly basis, even though doing so seems to result in better agreement. This paper is organized as follows. In Section 2, we briefly introduce the artificial data set used in the present analysis. The results we have obtained are presented in Section 3. Finally, we discuss our results and draw conclusions in Section 4.
2. ARTIFICIAL DATA
Following Chang (2008), we model the sunspot number as random noise superposed on a slowly varying background. The underlying part of the sunspot number data is assumed to be represented by a sum of undamped oscillators. Thus, the international relative sunspot number $R_i$ is assumed to be approximated by Eq. (1), in which ω represents the solar cycle frequency, φ is the phase shift, and ε is random noise. The parameter values can be chosen arbitrarily as long as the resulting function resembles the observed sunspot number data. Throughout the paper, ω corresponds to a period of 11 years. We employ two kinds of noise distributions for ε in the current paper: 1) a uniform distribution, with random numbers distributed uniformly between 0 and 1, and 2) an exponential distribution with unit mean and unit standard deviation. Multiplicative random noise has been found to reproduce the observational features more satisfactorily (Chang 2008), which is why we adopt multiplicative rather than additive random noise.
To simulate the calculation of the correction factor $k_b$, we need to generate a time series of the observed sunspot number $R_B$. This is done with Eq. (2), in which an additional random error term is assumed to obey the uniform distribution between 0 and 1, and α is a measure of the signal-to-noise ratio. Other symbols have the same meanings as in Eq. (1). We consider that this random error arises from differences in the skill of observers, the atmospheric stability of the observing site, the capacity of telescopes, and so on. One may also consider an alternative form of Eq. (2) whose only difference is a negligible cross term, given by α times the product of the two noise terms. We take this particular form of multiplicative noise since it agrees with the observed features quite well (Chang 2008). That is, the uncertainties are larger in periods with a large number of sunspots.
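The construction of the two artificial series can be sketched as follows. Since Eqs. (1) and (2) are not reproduced here, the functional forms in this sketch are assumptions made only for illustration: a single-oscillator background $b(t) = A[1 - \cos(\omega t + \varphi)]$, $R_i = b(t)(1 + \varepsilon)$, and $R_B = b(t)(1 + \varepsilon + \alpha\,\varepsilon_{\rm obs})$. The names `background`, `draw_noise`, `simulate`, and `eps_obs`, as well as the amplitude of 50, are hypothetical.

```python
import math
import random

PERIOD_DAYS = 11 * 365.25  # an 11-year cycle, as stated in the text


def background(t_days, amplitude=50.0, phase=0.0):
    """Assumed slowly varying background: A * (1 - cos(omega * t + phase))."""
    omega = 2.0 * math.pi / PERIOD_DAYS
    return amplitude * (1.0 - math.cos(omega * t_days + phase))


def draw_noise(rng, kind):
    """Noise epsilon: uniform on [0, 1) or exponential with unit mean."""
    if kind == "uniform":
        return rng.random()
    if kind == "exponential":
        return rng.expovariate(1.0)
    raise ValueError("kind must be 'uniform' or 'exponential'")


def simulate(n_days, alpha, kind="exponential", seed=0):
    """Return (r_i, r_b): daily reference and observed series.

    Assumed forms (not the paper's exact equations):
      r_i = b * (1 + eps)
      r_b = b * (1 + eps + alpha * eps_obs), with eps_obs uniform on [0, 1)
    """
    rng = random.Random(seed)
    r_i, r_b = [], []
    for t in range(n_days):
        b = background(t)
        eps = draw_noise(rng, kind)
        eps_obs = rng.random()
        r_i.append(b * (1.0 + eps))
        r_b.append(b * (1.0 + eps + alpha * eps_obs))
    return r_i, r_b


r_i, r_b = simulate(n_days=int(PERIOD_DAYS), alpha=0.1)
print(len(r_i), max(r_i), max(r_b))
```

With these assumed forms, the extra observer noise scales with the background, so its absolute size grows with the sunspot number, consistent with the multiplicative character described in the text.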
In Fig. 1, we show the monthly average of modeled sunspot numbers generated with the exponentially distributed random noise, ε, as a function of time in months. The abscissa is the time in months elapsed since the beginning of a solar cycle, and the ordinate is measured in an arbitrary unit. The solid line represents $R_i$ generated by Eq. (1). The dotted and dashed lines correspond to $R_B$ generated by Eq. (2) with α = 0.1 and α = 1.0, respectively. In Fig. 2, similarly, we show the monthly average of sunspot numbers generated with the uniformly distributed random noise, ε. The solid line represents $R_i$ generated by Eq. (1). The dotted and dashed lines correspond to $R_B$ generated by Eq. (2) with α = 0.1 and α = 1.0, respectively. Comparing the resulting data sets in Figs. 1 and 2, as concluded in Chang (2008), the exponential noise seems to reproduce the observational features reasonably well.
Fig. 1. Monthly average of modeled sunspot numbers generated with the exponentially distributed random noise, ε, as a function of time in months. The solid line represents $R_i$ generated by Eq. (1). The dotted and dashed lines correspond to $R_B$ generated by Eq. (2) with α = 0.1 and α = 1.0, respectively.
Fig. 2. Same as Fig. 1, but with the uniformly distributed random noise, ε. The solid line represents $R_i$ generated by Eq. (1). The dotted and dashed lines correspond to $R_B$ generated by Eq. (2) with α = 0.1 and α = 1.0, respectively.
Fig. 3. Example of the relationship between the daily observed number of sunspots ($R_B$) and $R_i$ for the whole period, assuming the exponential random noise with α = 0.1.
Fig. 4. Resulting yearly $k_b$ values as a function of year. The four cases, two random noise distributions with two different signal-to-noise ratios, are indicated at the upper left corner of each panel. Note that the scale is the same in every panel. The typical uncertainty in determining the slope is of the order of 0.01 and 0.001 for α = 1.0 and α = 0.1, respectively. These values are more or less the same for the exponential distribution and for the uniform distribution.
3. RESULTS
In Fig. 3, as an example, we show the relationship between the daily observed number of sunspots ($R_B$) and $R_i$ for the whole period, assuming the exponential random noise with α = 0.1. We apply the linear least-squares method to the relationship between the daily $R_B$ and $R_i$ to calculate the correction factor $k_b$ in each year, after dividing the whole data set into 11 yearly data subsets. In Fig. 4, we show the resulting yearly $k_b$
values as a function of years since the beginning of the solar cycle. Note that the scale is the same in every panel. The four cases, two random noise distributions with two different signal-to-noise ratios, are indicated at the upper left corner of each panel. The resulting correction factors vary from year to year, exactly as was noted by Oh & Chang (2012). That is, regardless of the noise distribution, we observe that the correction factor increases as the sunspot number increases. The typical uncertainty in determining the slope is of the order of 0.01 and 0.001 for α = 1.0 and α = 0.1, respectively. These values are more or less the same for the exponential distribution and for the uniform distribution. The degree of the dependence is subject to the signal-to-noise ratio. Therefore, we consider that the apparent dependence of the value of the correction factor $k_b$ on the phase of the solar cycle is not due to a physical mechanism, but is a statistical property of the data. One might instead attribute the dependence of the correction factor on the phase of the solar cycle to a physical cause: since the numbers of large and small sunspots differ, the sensitivity to, or detectability of, small sunspots could produce the dependence. If this were the case, the size of sunspots would be modulated as the solar cycle progresses, implying that the mechanism of sunspot formation beneath the solar surface varies in time. What we show here is that the apparent dependence does not require such a complicated mechanism, but can be explained as a statistical property of the observed data. That is, a low number of sunspots results in a lower correlation coefficient in the $R_B$-$R_i$ relationship. Furthermore, this effect turns out to be more serious when the noise level is high. When α is large, the resulting scatter (as shown in Fig. 3) becomes broader, and as a result the least-squares fit is dominated by noise.
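The yearly fitting procedure, in which the daily series is divided into yearly subsets and a slope is fitted to each, can be sketched as below. This is a hedged illustration only: the ordinary least-squares fit with an intercept, the fixed 365-day year, and the names `ols_slope` and `yearly_slopes` are assumptions, and the demonstration data are a noise-free toy series rather than the artificial data of Section 2.

```python
def ols_slope(x, y):
    """Ordinary least-squares slope of y against x (intercept included)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0:
        raise ValueError("x has zero variance; slope is undefined")
    return sxy / sxx


def yearly_slopes(x, y, days_per_year=365):
    """Split two daily series into consecutive yearly chunks and fit a
    slope to each chunk. Incomplete trailing chunks are ignored."""
    slopes = []
    for start in range(0, len(x) - days_per_year + 1, days_per_year):
        stop = start + days_per_year
        slopes.append(ols_slope(x[start:stop], y[start:stop]))
    return slopes


# Noise-free toy data covering 11 years: y is exactly 1.05 * x, so every
# yearly slope should come out close to 1.05.
x = [float(day % 97 + 1) for day in range(11 * 365)]
y = [1.05 * value for value in x]
for year, slope in enumerate(yearly_slopes(x, y), start=1):
    print(year, round(slope, 6))
```

In a noise-free series the yearly slopes are identical, so any year-to-year variation seen with noisy input reflects the statistics of the fit within each subset rather than a change in the underlying relation.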
4. DISCUSSION AND CONCLUSIONS
Monitoring sunspots regularly and consistently is an important and basic step in studying various aspects of solar activity. Observers must calculate their own correction factor $k$ regularly and keep it stable. In South Korea, sunspot observations have been performed at the Korea Astronomy and Space Science Institute (KASI) since 1987 (Sim et al. 1990, Kim et al. 2003) and at the ButterStar Observatory since 2002 (Oh & Chang 2012). Both groups have reported that revisions taking the yearly $k$ into account lead to results that agree better with $R_i$, and that the yearly $k$ apparently varies with the solar cycle. In this paper, using artificial data sets, we have investigated whether the variation in the correction factor is really coupled with the solar cycle. We have found that the apparent dependency is due to a statistical property of the data sets. Regardless of the statistical distribution of the random noise, the correction factor increases as the sunspot number increases. The degree of dependence is also subject to the signal-to-noise ratio. Therefore, we conclude that the apparent dependence of the value of the correction factor $k$ on the phase of the solar cycle is not due to a physical mechanism, but is a statistical property of the data.
The authors are grateful to all the past and present members of the ButterStar Observatory for their hard work and dedication, and also thank the anonymous referees for critical comments and helpful suggestions, which greatly improved the original version of the manuscript. HYC was supported by the National Research Foundation of Korea Grant funded by the Korean government (NRF-2011-0008123).
REFERENCES
Berghmans D, van der Linden RAM, et al. 2005, Solar activity: nowcasting and forecasting at the SIDC. DOI: 10.5194/angeo-23-3115-2005
Clette F, Berghmans D, Vanlommel P, van der Linden RAM, et al. 2007, From the Wolf number to the International Sunspot Index: 25 years of SIDC. DOI: 10.1016/j.asr.2006.12.045
Kim et al. 2003, The relative sunspot numbers from 1987 to 2002.
Letfus 2000, Relative sunspot numbers in the first half of eighteenth century.
Oh & Chang 2012, Relative sunspot number observed from 2002 to 2011 at Butterstar observatory.
Petrovay 2010, Solar cycle prediction.
Pulkkinen et al. 1999, Long-term variation of sunspot latitudes.
Sim et al. 1990, Analysis of the observational data of sunspots.
Solanki et al. 2004, Unusual activity of the Sun during recent decades compared to the previous 11,000 years. DOI: 10.1038/nature02995
Usoskin 2008, A history of solar activity over millennia.