Tucson format

From Cybis Wiki
Revision as of 11:25, 11 January 2017 by Lars-Ake (talk | contribs) (→‎Unexpected examples)

The Tucson format, also called the decadal format or rwl format, is one of the most common formats for storing ring width data. It is a plain text format and the standard format of the ITRDB. Several file extensions are in use, such as .rwl, .crn, .tuc and .dec (.crn is used for derived chronologies, whose syntax is not exactly the same). The name comes from the city of Tucson in Arizona.

Basics of the Tucson format for ring width data

A Tucson file usually consists of three lines of meta data followed by any number of data lines. Ring width data is written as integers in units of either 0.01 mm or 0.001 mm. A data line consists of the core identity (at most 8 alphanumeric characters, i.e. letters or digits), the year of the oldest measurement on the line (4 digits), and the ring width data, up to ten rings per line. Except for the first and last lines of each core, every line holds the measurements of one full decade.

After the youngest ring there is a stop marker as an extra value. The stop marker depends on the resolution used:

  • When using 0.01 mm as the unit of measure, the stop marker is "999"
  • When using 0.001 mm as the unit of measure, the stop marker is "-9999"

In other words, the stop marker shows not only where the series ends but also which unit of measurement is used (0.01 mm or 0.001 mm)!

The missing data mark is actually not defined, though "-999" is usually used in 0.01 mm files and "0" in 0.001 mm files. See also Note 4 below!


Note 1: Because the value 999 is the stop marker, a measurement of 9.99 mm in 0.01 mm units has to be changed to 9.98 mm (written value 998) or 10.00 mm (1000)! To avoid fooling other software, it is probably best never to write the value "999" as measurement data, not even in files with 0.001 mm units!
Note 2: If 5 digits are needed for the year number, i.e. for a core older than -999, the identity cannot be more than 7 alphanumeric characters.
Note 3: An ITRDB specification of the Tucson format is available [1], though that specification cannot be fully trusted. E.g. it currently (2009-07-09) specifies the missing data mark to be 999, a value which in reality is used as the stop marker.
Note 4: Cofecha seems to handle these two marks differently. When Cofecha encounters a -999 value, the series is split into two series at that mark, whereas a "0" value seems to be handled as an "absent" value which is simply omitted from correlations. CDendro from version 8.2 will have a setting for writing missing values as "0" also in .rwl files with 0.01 mm measurement units. You can get the same effect of unbroken series by changing the Decadal measurement unit to 0.001 mm in CDendro and writing out your series to a .rwl file again before submitting it to Cofecha.
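
The unit change described in Note 4 can be sketched in a few lines of Python. This is only an illustration, not CDendro's code: the function name to_thousandths is made up here, and it assumes that the stop marker has already been removed and that -999 is the missing data mark of the input.

```python
def to_thousandths(values):
    """Convert 0.01 mm values (stop marker already removed) to 0.001 mm units."""
    converted = []
    for value in values:
        if value == -999:
            converted.append(0)           # missing ring: "-999" becomes "0"
        else:
            converted.append(value * 10)  # 0.01 mm -> 0.001 mm
    return converted


print(to_thousandths([137, 89, -999, 79]))  # [1370, 890, 0, 790]
```

When the converted series is written out, the stop marker has to be -9999 instead of 999.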

PMkr12b 1781   120    87    69   122   108    85   125   114    77
PMkr12b 1790   134   131   114    97   117    49    69   100   123    89
PMkr12b 1800   137    89  -999    79    44    38    62    99    68    26
PMkr12b 1810    27    43    51    57    36   999

An example of a sample saved in 0.01 mm units, covering the timespan AD 1781-1814, with a missing ring for AD 1802 (-999). The annual ring of AD 1781 (the first year) is 1.20 mm wide and that of AD 1782 is 0.87 mm wide.

PMkr12b 1781  1200   870   690  1220  1080   850  1250  1140   770
PMkr12b 1790  1340  1310  1140   970  1170   490   690  1000  1230   890
PMkr12b 1800  1370   890     0   790   440   380   620   990   680   260
PMkr12b 1810   270   430   510   570   360 -9999

The same sample written in 0.001 mm units. Note the missing data mark which is here "0".
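
A minimal reader for well-behaved data lines like the two examples above might look as follows in Python. It is a sketch under assumptions: the names parse_data_line and read_series are invented for this page, the identity is taken from columns 1-8 and the year from columns 9-12, and a 5-character year is assumed to borrow column 8 as described in Note 2. Missing data marks are returned unchanged.

```python
STOP_MARKERS = {999: 0.01, -9999: 0.001}  # stop marker -> unit in mm


def parse_data_line(line):
    """Split one data line into (core identity, first year, list of values)."""
    line = line.rstrip()
    # Assumption: a 5-character year such as "-3550" takes over column 8,
    # leaving 7 characters for the identity (see Note 2).
    id_width = 7 if line[7] == "-" else 8
    core_id = line[:id_width].strip()
    first_year = int(line[id_width:12])
    values = [int(token) for token in line[12:].split()]
    return core_id, first_year, values


def read_series(lines):
    """Collect the values of one core up to the stop marker.

    Returns (core identity, start year, widths, unit in mm). The unit is
    None when no stop marker is found. A genuine measurement of 999 in a
    0.001 mm file would be mistaken for a stop marker (see Note 1).
    """
    core_id = start_year = None
    widths = []
    for line in lines:
        cid, year, values = parse_data_line(line)
        if core_id is None:
            core_id, start_year = cid, year
        for value in values:
            if value in STOP_MARKERS:
                return core_id, start_year, widths, STOP_MARKERS[value]
            widths.append(value)
    return core_id, start_year, widths, None
```

Because the reader stops at the stop marker, anything written after it on the same line is ignored.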


6682    1980   143   231   154   145   150   201   130   156   245   137
6682    1990   141   202   120    96   999
NM002   1632    90    92    91   174    84    45   185   111
NM002   1640   116    72    91    49    85   146   125   126   136   131

The usual end of one sample and start of the next (0.01 mm resolution)
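
Writing a series in this layout can be sketched like this in Python. The function format_series is a hypothetical helper, not part of any dendro program. It assumes 4-digit years, 6-character value columns as in the examples, and it refuses to write a measurement equal to the stop marker (see Note 1).

```python
def format_series(core_id, start_year, widths, stop_marker=999):
    """Return the data lines of one core, with the stop marker appended."""
    if stop_marker in widths:
        raise ValueError("a measurement equals the stop marker (see Note 1)")
    values = list(widths) + [stop_marker]
    lines = []
    year, pos = start_year, 0
    while pos < len(values):
        slots = 10 - year % 10  # positions left in the current decade
        chunk = values[pos:pos + slots]
        lines.append(f"{core_id:<8}{year:>4}" + "".join(f"{v:>6}" for v in chunk))
        pos += slots
        year += slots
    return lines


for line in format_series("NM002", 1632, [90, 92, 91, 174, 84, 45, 185, 111, 116]):
    print(line)
```

When the last measured ring fills a decade line, the stop marker ends up alone on a line of its own, as in several of the ITRDB files.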

Unexpected examples

The Tucson format standard is sometimes interpreted in ways that make it a bit hard to program a reader. The following examples are taken from files in the ITRDB and from other sources.

SH387C  1170    14    16    14    19    22    22    26    16    23    23
SH387C  1180    17    11    14    12   999     0     0     0     0     0
SH387D  1078    48    48
SH387D  1080    50    42    46    62    49    53    41    28    17    31

An example from brit9.rwl[2] where the positions after the stop marker are filled out with "0"


Q 9730   990    72    98   112   124   107   132   137   145   114    80

This snippet from brit045.rwl[3] looks quite normal, but it ends with two carriage return characters (ASCII 13), which are not trimmed away by the Visual Basic Trim function.
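
For comparison, a sketch in Python of how such trailing characters can be removed explicitly. The helper name clean_line is made up for this example.

```python
def clean_line(raw):
    """Drop trailing carriage returns, line feeds, tabs and blanks."""
    return raw.rstrip("\r\n \t")


raw = "Q 9730   990    72    98   112\r\r"
print(repr(clean_line(raw)))  # 'Q 9730   990    72    98   112'
```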


WRU9    1190   190   192   218   213   204   259   206   150   178   149
WRU9    1200   198   232   151   199   175   196  9990  9990  9990  9990
WRU13   1075  9990  9990  9990  9990  9990   342   426   240   213   217

A snippet from brit5.rwl.[4] It both ends one sample and starts the next with 9990 markers.


MWK964  1970    16    11    22    25     9    13    26    24    23    16        
MWK964  1980   999                                                              
MWK965   509    62     0     0     0     0     0     0     0     0     0        
MWK965   510    47    45    25    19    33    24    32    51    24    22        
...
MWK401 -3550    26    21    19    20    28    21    13    11    -0    11  

Example from ca535.rwl[5] where zeroes are filled into positions which are not in use, and where -0 is used instead of 0 or -999 for missing rings. Each row also ends with a number of extra white spaces.

BG8     1920   324   693   690   700  1130   847   980  1143   680   566
BG8     1930   548   671   907  1197   902  1169   969   862  1051  1125
BG8     1940   549   361   496   767   782   168    38    -7    -7    -7
BG8     1950   102   119    31    94   110    79   118   240   245   251

Example where "-7" has been used to indicate a missing ring.
A comment from a CDendro user: "To give an example of when negative values are used with a particular meaning, it is possible to fill in missing data in ARSTAN. The relevant setting fills in missing data if the data points are denoted with a negative number (excluding -9999). Data points with zero values do not get filled in."


606 13  1570    24    31    30    25    26    24    27    27    33    30
606 13  1580    20   999
606 13  1586    20    19    19    18
606 13  1590    27    20    20    25    22    22    23    23    10    15

Example from fran009.rwl.[6] Two segments separated by a small gap of missing rings are written in the same way as two separate samples, though here with the same identity.

Note that there are also .rwl files of this type with several other samples written between the segments. See e.g. ITRDB germ011.rwl, where the identity 371241 starts the collection with one segment, many other members follow, and a last segment of 371241 ends the collection.

In CDendro these segments are handled as separate samples though they have the same identity within the .rwl file. The identity problem is then solved by giving them temporary identities like "606 13:1" and "606 13:2".
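
The renaming idea can be sketched as follows in Python. This is not CDendro's implementation: the function assign_temporary_ids and the (identity, start year, widths) tuples are assumptions made for the illustration. Segments are numbered in file order, also when other members are written between them.

```python
from collections import Counter


def assign_temporary_ids(segments):
    """Give segments that share an identity the names "id:1", "id:2", ..."""
    totals = Counter(core_id for core_id, _, _ in segments)
    seen = Counter()
    renamed = []
    for core_id, start_year, widths in segments:
        if totals[core_id] > 1:
            seen[core_id] += 1
            core_id = f"{core_id}:{seen[core_id]}"
        renamed.append((core_id, start_year, widths))
    return renamed


segments = [("606 13", 1570, [24, 31]), ("606 13", 1586, [20, 19])]
print([name for name, _, _ in assign_temporary_ids(segments)])
# ['606 13:1', '606 13:2']
```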


638003  1450   148   101   114    93    81    96   100    95    79    57
638003  1460    78   119    78    86    89    64    89    89   101    99
638003  1470    94   192   172   143    81   103    91   100   122   999
638003     0     0     0     0     0     0     0     0     0     0     0
638003    10     0     0     0     0     0     0     0     0     0     0
638003    20     0     0     0     0     0     0     0     0     0     0
638003    30     0     0     0     0     0     0     0     0     0     0

Example from swit177w.rwl.[7] This is currently detected as a "jump in years" error by CDendro, which assumes that the 999 means a missing ring. We could just as well say that if a 999 is followed by an unanticipated year number, then that data section should be handled as a new member (a new sample) even when the member identity is unchanged.


OMA0851A1623   232   126   216   213   157   258   263     .     .     .
OMA0851A1630   246   182   174   187   240   175   169   173   149   150
...
OMA0851A1810    29    33    30    31    30    41    38    32    52    65
OMA0851A1820    47    57    55    40   999     .     .     .     .     .
OMA0851B1623   216   142   209   181   207   273   265     .     .     .
OMA0851B1630   189   172   162   153   155   145   147   150   157   155
OMA0851B1640   166   138   132   105    99   107   129    69    78    86
OMA0851B1650   140   156    60    47    35    75    26    41    50    37
...
OMA0851B1840    14    15    15    12    12    11    13    10    11    11
OMA0851B1850     6     8   999     .     .     .     .     .     .     .
OMA0852A1692   140   161   151   144    97   122   149   160     .     .
OMA0852A1700   222   237   251   153   185   191   234   293   189   159
OMA0852A1710   213   182   174   213   114   136   143   129   170   130
...

Example from the Finnish Tree ring data bank of Saima. With this layout each "unused position" is marked with a dot (.).


109540  1690    48    45    45    39    39    33    33    36    33    33
109540  1700    36    30    33    27    36    39    36    39    36    39
109540  1710    39    33    33    30    33    30  22.5  22.5  25.5  25.5
109540  1720  25.5  31.5  34.5    30  37.5    45    39  37.5  40.5    36
109540  1730    30  31.5  31.5  34.5    30    30  34.5  34.5    33    36
109540  1740  31.5    30  25.5    27    30    30    24    30   999
109550  1521   138   129   114    99    99   189   120   114   111
109550  1530   111   105    99   102   105    96    90   102    81    93

A snippet from germ21.rwl.[8] This is the only case we have found with decimal values (like "22.5").
Note: The germ21.rwl file contains another severe error (a missing line of data) that makes it unreadable to CDendro.
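
A reader that wants to survive several of the variants shown above (dots in unused positions, "-0", decimal values) needs a tolerant value parser. The following Python sketch shows one possible approach. The name parse_value and the choice to return None for an unused position are assumptions for this illustration, not the behaviour of CDendro or any other program.

```python
def parse_value(token):
    """Interpret one value field from a data line.

    Returns None for an unused position (blank or "."), an int for whole
    numbers (including "-0", which becomes 0) and a float for decimal
    values such as "22.5".
    """
    token = token.strip()
    if token in ("", "."):
        return None
    number = float(token)
    return int(number) if number.is_integer() else number


print([parse_value(t) for t in ["  999", "  -0", "  22.5", "     ."]])
# [999, 0, 22.5, None]
```

What a particular negative value or zero means (missing ring, padding, stop marker) still has to be decided by the caller.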


cc321201   0     0  3934  2736  2620  4184  4334  4438  4790  4968  5330 wb       
cc321201  10  5008  1796  4786  3940  2762  1870  1804  1912  1588  1874 wb       
cc321201  20  2634  1840  1994  2120  3334  1252  1206  1362  4186  1938 wb       
cc321201  30  2090   530  1594  2492  1686  2064   518  2854  1426  1652 wb       
cc321201  40   774  1080  1130  1248 -9999                               wb       
cc321202   0     0  5022  4404  5402  4176  5676  2666  6004  5212  4262 wb       
cc321202  10  2206  2928  1894  2458  4806  2846  2880  2354  2318  2550 wb       
cc321202  20  2028  2178  2054  2230  2492  1756  1892  2730  1948  1922 wb       
cc321202  30  1334  2016  1386  2154  1708  4396  1336  1742   502  2868 wb       
cc321202  40 -9999                                                       wb       

From measurements performed on a Velmex machine and saved as time series in a decadal format
(The " wb" word at end of each line made CDendro versions before 8.1.2 reject the data.)


Tucson data created from old Catras files
S00400791970   231   122   124   128   101   118   117   127   120   142
S00400791980   106   117   124   144   122   117   117    99   131   115
S00400791990   121    90 -9999   999
S00400891828 -9999   325
S00400891830   318   211   228   236   227   345   300   350   298   287
S00400891840   222   208   262   259   221   216   255   327   309   255
....
S00400891970   185   172   211   200   143   208   203   208   185   206
S00400891980   183   240   191   221   194   197   156   156   200   198
S00400891990   219   211 -9999   999
S00400991772 -9999 -9999 -9999 -9999   162   103   133   131
S00400991780   101   141   237   233   207   132   121   222   198   206
S00400991790   226   271   387   384   273   309   273   224   269   313
S00400991800   230   246   290   233   274   199   145   162   180   213

Example created by Convert5 from files available in .cat-format in the ITRDB, christensen_denmark_oak/sverige/ (When CDendro opens such Catras files, all -9999 data is automatically removed.)


Tab- or Space-characters as field delimiters

There are Tucson-like files with a tab character as the delimiter between the fields.
One or more space characters also occur as the field delimiter, especially when a Tucson file has been read in from a printed document or a .pdf file.
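
For such files the fixed columns cannot be trusted, so the fields have to be split on the delimiters instead. A minimal Python sketch, where split_fields is a made-up name and the tab-delimited input line is constructed for the example: it assumes that the identity and the year are separated by at least one delimiter, and that an identity in a space-delimited file contains no blanks.

```python
def split_fields(line):
    """Split a delimiter-separated line into (identity, year, values)."""
    line = line.rstrip("\r\n")
    if "\t" in line:
        # Tab-delimited: an identity such as "606 13" may contain a blank.
        fields = [field.strip() for field in line.split("\t")]
        fields = [field for field in fields if field]
    else:
        # Space-delimited: any run of blanks separates two fields.
        fields = line.split()
    return fields[0], int(fields[1]), [int(value) for value in fields[2:]]


print(split_fields("606 13\t1580\t20\t999"))  # ('606 13', 1580, [20, 999])
```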

CDendro interpretation

Using comments

In descriptions of dendrochronology programs such as Cofecha and Arstan, it is often noted that lines which cannot be interpreted as ring width data lines are considered as comments. CDendro makes use of that feature. Comments may be bound both to the .rwl file (the sample collection) itself and to individual members (samples) of the collection.

SN     1 Saltsjobaden                                        PISY
SN     2 SWEDEN       Scots pine               5917N1818E          1696 2005
SN     3 Lars-Ake Larsson
SN      #### Samples taken from living or fallen trees except the group SNKBxx, which are poles of an old pier
SN      #### (Kolbryggan) in the bay Palnasviken, which have been standing in clay for a 110 years.
SN      #### The sample SNSU01 is taken from the Skutudden cottage.
SN001A  1923   341   374   369   298   500   382   396
SN001A  1930   332   297   421   250   290   288   317   320   256   215
SN001A  1940   111   183   229   183   159   157   163   134   105   111
SN001A  1950    81    62    89   138   164   180   138   157   170   130
SN001A  1960   108   184   137   148   124   164    80    98   105    67
SN001A  1970    57    89   126   101   114   100    83    76    71    57
SN001A  1980   114    87    79    85    56    49    61    79    81    70
SN001A  1990    99   139   132   161   132   118   999
SN001A  #### You may store comments for a sample too
SN001A  #### and it may extend over several lines.
SN001B  1923   329   375   319   299   435   366   384
SN001B  1930   287   258   392   280   251   293   296   278   200   181
SN001B  1940   114   154   224   160   138   106   127   126    97    96
SN001B  1950    68    65    83   114   168   176    91   163   189   146

Example of .rwl file with comments created with CDendro
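
Judging from the example above, a comment line carries "####" in columns 9-12, preceded by the identity it belongs to. Under that assumption (it is read off the example, not taken from a specification), comments can be separated from data lines like this in Python. The function name split_comment is made up here.

```python
def split_comment(line):
    """Return (owner identity, comment text) for a "####" line, else None."""
    if line[8:12] != "####":
        return None
    return line[:8].strip(), line[12:].strip()


print(split_comment("SN001A  #### You may store comments for a sample too"))
# ('SN001A', 'You may store comments for a sample too')
print(split_comment("SN001A  1990    99   139   132   161   132   118   999"))
# None
```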

The CDendro .rwc format for specifying the proportion of Latewood

In CooRecorder there are provisions for measuring latewood and earlywood. As a .rwl file cannot contain both latewood and earlywood, CDendro has a .rwc format which is almost the same as the .rwl format, though the latewood is specified as a sequence of permillage values at the end of the normal ring width lines:

SN     1 Saltsjobaden                                        PISY
SN     2 SWEDEN       Scots pine               5917N1818E          1733 2003
SN     3 Lars-Ake Larsson
SN001A  1923   341   374   369   298   500   382   396                   #   79  131   63  179  143  288  139
SN001A  1930   332   297   421   250   290   288   317   320   256   215 #  245  190  300   76  201  267  288  326  281  230
SN001A  1940   111   183   229   183   159   157   163   134   105   111 #  101  160  210  214  151  243  173   49  123  139
SN001A  1950    81    62    89   138   164   180   138   157   170   130 #   55   47  100  189  283   25  167  293  350   26
SN001A  1960   108   184   137   148   124   164    80    98   105    67 #  392  377  195  209   99  280  108  187  195  108
SN001A  1970    57    89   126   101   114   100    83    76    71    57 #  247  315  209  105  331  152   98  248  263  301
SN001A  1980   114    87    79    85    56    49    61    79    81    70 #  412  199  253   77  203  190  146  215  391  206
SN001A  1990    99   139   132   161   132   118   999                   #  518  345  278  369  229  163
SN001B  1923   329   375   319   299   435   366   384                   #   99  122   73  135  146  223  139
SN001B  1930   287   258   392   280   251   293   296   278   200   181 #  239  126  150  126  232  178  183  326  363  228

The ring width value of 1923 is 3.41 mm with the latewood being 0.079*3.41=0.27 mm.
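
The calculation can be sketched in Python as follows. The names parse_rwc_line and latewood_mm are invented for this example, and the sketch assumes 0.01 mm units and data lines only (no "####" comment lines).

```python
def parse_rwc_line(line):
    """Split a .rwc data line into (identity, year, widths, latewood per mille)."""
    left, _, right = line.partition("#")
    core_id = left[:8].strip()
    first_year = int(left[8:12])
    widths = [int(token) for token in left[12:].split()]
    permille = [int(token) for token in right.split()]
    return core_id, first_year, widths, permille


def latewood_mm(width_hundredths, permille):
    """Latewood width in mm from a ring width in 0.01 mm and a per mille share."""
    return width_hundredths * 0.01 * permille / 1000


line = ("SN001A  1923   341   374   369   298   500   382   396"
        "                   #   79  131   63  179  143  288  139")
core_id, year, widths, permille = parse_rwc_line(line)
print(round(latewood_mm(widths[0], permille[0]), 2))  # 0.27
```

On the last line of a sample the width list also holds the stop marker, which has no latewood value of its own, so the two lists differ in length by one there.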


Meta data

Meta data that is collected from e.g. a Heidelberg file can be written by CDendro to a .rwl file:

...
1AD0046A1500    54    64    64    67    64    54    61    71    78    81
1AD0046A1510    84    94   105    84   108   108   115    88   135   108
1AD0046A1520   111    98   121    94   138   118   111    94   108   105
1AD0046A1530   999
1AD0046A#### Location=Stockholm;  Species=PISY;  TreeNo=0;
1AD0046A#### CoreNo=0;  Project=123;
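
Such "name=value;" comments are easy to read back. A Python sketch, assuming that the syntax is exactly what the example above shows (the name parse_meta is made up, and all values are returned as text):

```python
def parse_meta(comment_text):
    """Turn "Location=Stockholm;  Species=PISY;" into a dictionary."""
    meta = {}
    for part in comment_text.split(";"):
        key, separator, value = part.partition("=")
        if separator:  # skip empty parts and parts without "="
            meta[key.strip()] = value.strip()
    return meta


print(parse_meta("Location=Stockholm;  Species=PISY;  TreeNo=0;"))
# {'Location': 'Stockholm', 'Species': 'PISY', 'TreeNo': '0'}
```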

Naming standard

See CDendro naming standard

Limitations of the Tucson format

  • The amount of meta data is limited to what is specified for the first three lines of a .rwl file. The syntax of that meta data is also only loosely specified, i.e. in practice it is almost free text.
  • If meta data is stored as comments with a special syntax, there is no common specification for how that meta data should be named and written.
  • There is no specification on how Latewood and Earlywood should be stored within the same .rwl file. Within the ITRDB Latewood and Earlywood data is saved in separate .rwl files.
  • There is no specification of e.g. a naming standard for keeping radii from the same stem together, though see also CDendro naming standard.

The Tucson chronology .crn format

Quoted from http://www.ncdc.noaa.gov/paleo/treeinfo.html

Processed Data Files (Site Chronologies, File Extension .CRN)

These are the standardized tree-growth indices from a stand of trees, representing the mean growth observed for each year
over the entire stand. Site chronologies are used in climate analysis. Data are stored as 3 or 4-digit numbers, with a value 
of 1000 representing mean growth, a minimum value of 0 (no growth), and no defined maximum. There is only one time series per 
file, in contrast to the raw data files. Missing value code is 9990. Site information is stored in the first 3 records of the file.

Format for chronology header records:

Record #1: 1-6 Site ID, 10-61 Site Name, 62-65 Species Code, optional ID#'s
Record #2: 1-6 Site ID, 10-22 State/Country, 23-30 Species, 41-45 Elevation, 48-57 Lat-Long, 68-76 1st & last Year
Note: lat-lons are in degrees and minutes, ddmm or dddmm
Record #3: 1-6 Site ID, 10-72 Lead Investigator, 73-80 comp. date

Chronology Data, Records 4-??

Site ID# column 1-6
Decade column 7-10
Index Value-Sample Number* pairs of values, columns 11-80, 10(I4+I3)
TRL ID#(optional) column 82-88

*Index Values, columns 11-14,18-21,25-28,32-35,etc
# of samples used in calculating chronology, columns 15-17,22-24,29-31,36-38,etc.
Example:1450 670 171018 17 897 18...
Here, 670 is the ring-width index value for the year 1450, with a sample size of 17;
1018 is the ring-width index value for the year 1451, with a sample size of 17;
897 is the ring-width index value for the year 1452, with a sample size of 18


Chronology Statistics, Last Record, Optional:
Site ID# column 1-6
Number of Years column 8-10
First Order Autocorrelation column 13-16
Standard Deviation column 19-22
Mean Sensitivity column 25-28
Mean Index Value column 29-35
Sum of Indices column 37-44
Sum of Squares of Indices column 46-53
Max# of series column 62-63

Example:

ANEBY  1 Anebymossen                                         PISY               
ANEBY  2 Sweden       Scots Pine         295M  5751 01438    __    1846 1996    
ANEBY  3 Hans Linderholm                                                        
ANEBYS18469990  09990  09990  09990  09990  09990  0 798  11495  1 779  1 976  1
ANEBYS1850 869  1 984  11116  1 881  2 936  2 843  2 394  2 613  21538  21406  2
ANEBYS18601466  2 967  21468  3 735  3 701  31041  3 776  31116  31271  3 804  3
ANEBYS18701270  3 821  31034  4 772  4 818  41011  41067  41052  41073  61195  6
ANEBYS18801523  71086  8 926  8 771  9 986  9 950  9 992  9 807  9 908  91041  9
ANEBYS18901169  9 920  91274  91259  91074 10 974 101198 101208 111371 121022 12
ANEBYS1900 774 121071 12 755 12 851 12 721 12 730 12 814 12 758 12 642 12 727 12
ANEBYS19101101 12 987 13 830 14 731 141001 141022 14 991 14 701 14 992 141047 14
ANEBYS1920 973 141158 141149 141212 14 960 16 978 17 992 17 957 17 660 17 498 17
ANEBYS1930 539 17 679 17 774 17 903 171183 171261 171490 171603 181441 181407 19
ANEBYS19401087 191163 191048 20 981 20 900 201398 201485 201044 21 900 211003 21
ANEBYS1950 974 21 962 21 836 21 975 21 881 21 891 21 822 21 967 211093 21 905 21
ANEBYS1960 702 21 843 21 803 21 829 21 901 21 900 21 908 211256 211256 211122 21
ANEBYS1970 846 21 852 211198 211274 211026 211320 211507 211178 211384 211313 21
ANEBYS19801232 211075 21 861 21 977 21 984 21 945 21 956 21 922 21 779 21 819 21
ANEBYS1990 703 21 889 21 774 21 892 21 939 211097 21 820 219990  09990  09990  0

Example from swed314.crn[9] showing the structure of a chronology file.
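
A data record of this kind can be read with fixed column positions. The Python sketch below is an illustration only, with a made-up function name. It assumes, as in the example above, that the ten (I4+I3) pairs of a record always start at the beginning of the decade and that positions outside the chronology are filled with 9990. The header records and the optional statistics record are not handled.

```python
MISSING = 9990  # missing value code in .crn files


def parse_crn_line(line):
    """Return (site identity, {year: (index value, number of samples)})."""
    site_id = line[:6]
    first_year = int(line[6:10])
    decade_start = first_year - first_year % 10
    result = {}
    for slot in range(10):
        cell = line[10 + 7 * slot:17 + 7 * slot]
        if len(cell) < 7:
            break  # short record: no more pairs on this line
        index_value, samples = int(cell[:4]), int(cell[4:])
        if index_value != MISSING:
            result[decade_start + slot] = (index_value, samples)
    return site_id, result


record = "ANEBYS18469990  09990  09990  09990  09990  09990  0 798  11495  1 779  1 976  1"
print(parse_crn_line(record))
# ('ANEBYS', {1846: (798, 1), 1847: (1495, 1), 1848: (779, 1), 1849: (976, 1)})
```

An index value of 1000 represents mean growth, so dividing by 1000 gives the index as a ratio.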

CDendro interpretation

CDendro allows for storing extra comments in the header not only of ordinary .rwl files but also in .crn files. Example:

NORRW  1 Norrland wood                                       PISY
NORRW  2 Sweden       Scots pine               6300N1800E          1513 1874
NORRW  3 Lars-Ake Larsson
#Based on 
#A. samples from boards (outside wooden panel) of the Sandviken house, Vastanvik, Namdo.
#B. a sample from a board in an old barn at Langholmen, Runmaro, an island east of Stockholm.
#C. a house built in 1907 in Saltsjobaden (Palnasv 1)
#The wood obviously seems to be imported from long way north of the Stockholm area.
#CDENDRO_DC This chronology is intended for crossdating purpose only!  Standardization: NegExpDetrend SumByStem=Yes
NorTrd15139990  09990  09990  01544  11752  11690  11225  11713  11345  11239  1
NorTrd1520 823  11235  11489  11208  1 834  11265  11050  11416  11923  12001  1
NorTrd15301797  12041  12305  12632  13319  13204  12575  13262  12438  12525  1
NorTrd15401932  12727  12650  11674  21748  21832  21842  21856  21438  21350  2
NorTrd15501305  21713  21395  21572  21233  21572  21531  21439  21546  31482  3
......

See also

See also Dendro data format

Notes