Tucson format: Difference between revisions
Revision as of 21:01, 21 October 2013
Tucson format, also called decadal format or rwl format, is one of the most common formats for storing ring width data. It is the standard format of the ITRDB. It is a plain text file format. Different file extensions are used, such as .rwl, .crn, .tuc and .dec (.crn is used for derived chronologies, whose syntax is not exactly the same). The name comes from the city of Tucson in Arizona.
Basics of the Tucson format for ring width data
A Tucson file usually consists of three lines of meta data followed by any number of data lines. Ring width data is written as integers, either in units of 0.01 mm or in units of 0.001 mm. A data line consists of the core identity (max 8 alphanumeric characters, i.e. letters or digits), the year of the oldest measurement on the line (4 digits), and then the ring width data, up to ten rings per line. Except for the first and last lines of each core, each line holds measurements for one full decade.
After the youngest ring there is a stop marker as an extra value. The stop marker depends on the resolution used:
- When using 0.01 mm as the unit of measure, the stop marker is "999"
- When using 0.001 mm as the unit of measure, the stop marker is "-9999"
In other words, the stop marker shows not only the end of the series but also the unit of measurement used (0.01 mm or 0.001 mm).
The missing data mark is not actually defined, though "-999" is usually used in 0.01 mm files and "0" in 0.001 mm files.
Note 1: A consequence of using the value 999 as a stop marker is that a measurement of 9.99 mm, in 0.01 mm units, has to be changed into 9.98 mm (written value 998) or 10.00 mm (1000). To avoid confusing other software, it is probably best never to write the value "999" as measurement data, even in files with 0.001 mm units.
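The workaround in Note 1 can be sketched as a small helper applied when writing values. This is only an illustration: the function name `encode_width` and its `per_mm` parameter are hypothetical, and choosing 998 rather than 1000 as the replacement is an arbitrary assumption.

```python
def encode_width(width_mm, per_mm=100):
    """Convert a ring width in mm to the integer written in the file.

    per_mm=100 means 0.01 mm units, per_mm=1000 means 0.001 mm units.
    A result of 999 would collide with the stop marker, so it is
    nudged to 998 (an arbitrary choice; 1000 would work as well).
    """
    value = round(width_mm * per_mm)
    if value == 999:
        return 998
    return value
```

With this helper a measurement of 9.99 mm in 0.01 mm units is stored as 998, and the same guard is applied in 0.001 mm files, as the note recommends.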
Note 2: If 5 characters are needed for the year number, i.e. for a core with rings older than year -999, the identity cannot be more than 7 alphanumeric characters.
Note 3: An ITRDB specification of the Tucson format is available [1], though that specification cannot be fully trusted. E.g. it currently (2009-07-09) specifies the missing data mark to be 999, a value which in reality is used as the stop marker.
PMkr12b 1781 120 87 69 122 108 85 125 114 77
PMkr12b 1790 134 131 114 97 117 49 69 100 123 89
PMkr12b 1800 137 89 -999 79 44 38 62 99 68 26
PMkr12b 1810 27 43 51 57 36 999
An example of a sample saved in 0.01 mm units, covering the timespan AD 1781-1814, with a missing ring for AD 1802 (-999). The width of the annual ring of AD 1781 (the first year) is 1.20 mm, and that of AD 1782 is 0.87 mm.
PMkr12b 1781 1200 870 690 1220 1080 850 1250 1140 770
PMkr12b 1790 1340 1310 1140 970 1170 490 690 1000 1230 890
PMkr12b 1800 1370 890 0 790 440 380 620 990 680 260
PMkr12b 1810 270 430 510 570 360 -9999
The same sample written in 0.001 mm units. Note the missing data mark which is here "0".
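The basic rules can be sketched in a few lines of Python. This is a minimal sketch, not how any particular program reads the format: the name `parse_series` is hypothetical, and it assumes whitespace-separated fields and an identity without embedded spaces that is not fused to the year, which the unexpected examples further down show is not always the case.

```python
# Stop marker -> unit of measurement in mm, as described above.
STOP_MARKERS = {999: 0.01, -9999: 0.001}

def parse_series(lines):
    """Parse the data lines of ONE series.

    Returns (core_id, first_year, widths, unit_mm). Missing-ring marks
    such as -999 or 0 are kept as raw values. A genuine measurement of
    999 would be mistaken for the stop marker (see Note 1).
    """
    core_id, first_year, widths = None, None, []
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # too short to be a data line
        if core_id is None:
            core_id, first_year = fields[0], int(fields[1])
        for token in fields[2:]:
            value = int(token)
            if value in STOP_MARKERS:
                return core_id, first_year, widths, STOP_MARKERS[value]
            widths.append(value)
    raise ValueError("no stop marker found")
```

Applied to the two PMkr12b examples, the function returns the same 34 rings starting in 1781, with a unit of 0.01 mm for the first file and 0.001 mm for the second.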
6682 1980 143 231 154 145 150 201 130 156 245 137
6682 1990 141 202 120 96 999
NM002 1632 90 92 91 174 84 45 185 111
NM002 1640 116 72 91 49 85 146 125 126 136 131
The usual end and start of samples (0.01 mm resolution)
Unexpected examples
The Tucson format standard is sometimes interpreted in ways that make files hard to parse. The following examples are taken from files in the ITRDB and from other sources.
SH387C 1170 14 16 14 19 22 22 26 16 23 23
SH387C 1180 17 11 14 12 999 0 0 0 0 0
SH387D 1078 48 48
SH387D 1080 50 42 46 62 49 53 41 28 17 31
An example from brit9.rwl[2] where the positions after the stop marker are filled with "0".
Q 9730 990 72 98 112 124 107 132 137 145 114 80
This snippet from brit045.rwl[3] looks quite normal, but the line ends with two carriage return characters (ASCII 13), which are not trimmed away by the Visual Basic Trim function.
WRU9 1190 190 192 218 213 204 259 206 150 178 149
WRU9 1200 198 232 151 199 175 196 9990 9990 9990 9990
WRU13 1075 9990 9990 9990 9990 9990 342 426 240 213 217
A snippet from brit5.rwl.[4] Here a sample both ends and starts with 9990 markers.
MWK964 1970 16 11 22 25 9 13 26 24 23 16
MWK964 1980 999
MWK965 509 62 0 0 0 0 0 0 0 0 0
MWK965 510 47 45 25 19 33 24 32 51 24 22
...
MWK401 -3550 26 21 19 20 28 21 13 11 -0 11
Example from ca535.rwl[5] where zeroes are filled into positions which are not in use, and where -0 is used instead of 0 or -999 for missing rings. Each row also ends with a number of extra white spaces.
BG8 1920 324 693 690 700 1130 847 980 1143 680 566
BG8 1930 548 671 907 1197 902 1169 969 862 1051 1125
BG8 1940 549 361 496 767 782 168 38 -7 -7 -7
BG8 1950 102 119 31 94 110 79 118 240 245 251
Example where "-7" has been used to indicate a missing ring.
A comment from a CDendro user: "To give an example of when negative values are used with a particular meaning, it is possible to fill in missing data in ARSTAN. The relevant setting fills in missing data if the data points are denoted with a negative number (excluding -9999). Data points with zero values do not get filled in."
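A reader that wants to tolerate marks such as "-0", "-7" and "-999" could classify each token before using it. The sketch below is one possible heuristic, not documented behaviour of CDendro or ARSTAN; the name `is_missing` and the `zero_is_missing` switch are hypothetical.

```python
def is_missing(token, zero_is_missing=False):
    """Guess whether a value token marks a missing ring.

    Any token with a minus sign counts as missing (this includes "-0",
    which int() alone would turn into a plain 0), except -9999, which
    is the stop marker of 0.001 mm files. A plain 0 counts as missing
    only if the caller says so, since zero is also used as padding.
    """
    token = token.strip()
    value = int(token)
    if value == -9999:
        return False
    if token.startswith("-"):
        return True
    return zero_is_missing and value == 0
```

Checking the sign on the text token rather than on the parsed integer is what lets "-0" be distinguished from "0".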
606 13 1570 24 31 30 25 26 24 27 27 33 30
606 13 1580 20 999
606 13 1586 20 19 19 18
606 13 1590 27 20 20 25 22 22 23 23 10 15
Example from fran009.rwl.[6] Here two segments with a small gap of missing rings in between are written in the same way as two separate samples, though with the same identity.
Note that there are also .rwl files of the type above in which several other samples are written between the segments. See e.g. ITRDB germ011.rwl, where a segment with the identity 371241 starts the collection, many other members follow, and finally another segment of 371241 ends the collection.
In CDendro these segments are handled as separate samples although they have the same identity within the .rwl file. The identity problem is solved by giving them temporary identities like "606 13:1" and "606 13:2".
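The temporary-identity scheme can be illustrated as follows. This is a sketch of the idea only, not CDendro's implementation; `label_segments` is a hypothetical name, and the input is assumed to be a list of already separated segments in file order.

```python
from collections import defaultdict

def label_segments(segments):
    """Give repeated identities temporary labels like "606 13:1".

    segments: list of (identity, data) tuples in file order.
    Identities that occur only once are left unchanged.
    """
    totals = defaultdict(int)
    for identity, _ in segments:
        totals[identity] += 1

    seen = defaultdict(int)
    labelled = []
    for identity, data in segments:
        if totals[identity] > 1:
            seen[identity] += 1
            labelled.append((f"{identity}:{seen[identity]}", data))
        else:
            labelled.append((identity, data))
    return labelled
```

Because the numbering follows file order and not adjacency, the same function also covers files where other samples are written between the segments.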
638003 1450 148 101 114 93 81 96 100 95 79 57
638003 1460 78 119 78 86 89 64 89 89 101 99
638003 1470 94 192 172 143 81 103 91 100 122 999
638003 0 0 0 0 0 0 0 0 0 0 0
638003 10 0 0 0 0 0 0 0 0 0 0
638003 20 0 0 0 0 0 0 0 0 0 0
638003 30 0 0 0 0 0 0 0 0 0 0
Example from swit177w.rwl.[7] This is currently detected as a "jump in years" error by CDendro, which assumes that the 999 means a missing ring. One could equally well say that if a 999 is followed by an unanticipated year number, then that data section should be handled as a new member (a new sample), even when the member identity is unchanged.
OMA0851A1623 232 126 216 213 157 258 263 . . .
OMA0851A1630 246 182 174 187 240 175 169 173 149 150
...
OMA0851A1810 29 33 30 31 30 41 38 32 52 65
OMA0851A1820 47 57 55 40 999 . . . . .
OMA0851B1623 216 142 209 181 207 273 265 . . .
OMA0851B1630 189 172 162 153 155 145 147 150 157 155
OMA0851B1640 166 138 132 105 99 107 129 69 78 86
OMA0851B1650 140 156 60 47 35 75 26 41 50 37
...
OMA0851B1840 14 15 15 12 12 11 13 10 11 11
OMA0851B1850 6 8 999 . . . . . . .
OMA0852A1692 140 161 151 144 97 122 149 160 . .
OMA0852A1700 222 237 251 153 185 191 234 293 189 159
OMA0852A1710 213 182 174 213 114 136 143 129 170 130
...
Example from the Finnish Tree ring data bank of Saima. With this layout each "unused position" is marked with a dot (.).
109540 1690 48 45 45 39 39 33 33 36 33 33
109540 1700 36 30 33 27 36 39 36 39 36 39
109540 1710 39 33 33 30 33 30 22.5 22.5 25.5 25.5
109540 1720 25.5 31.5 34.5 30 37.5 45 39 37.5 40.5 36
109540 1730 30 31.5 31.5 34.5 30 30 34.5 34.5 33 36
109540 1740 31.5 30 25.5 27 30 30 24 30 999
109550 1521 138 129 114 99 99 189 120 114 111
109550 1530 111 105 99 102 105 96 90 102 81 93
A snippet from germ21.rwl.[8] This is the only case we have found with decimal values (like "22.5").
Note: The germ21.rwl file contains another severe error (a missing line of data) that makes it unreadable to CDendro.
- Tucson data created from old Catras files
S00400791970 231 122 124 128 101 118 117 127 120 142
S00400791980 106 117 124 144 122 117 117 99 131 115
S00400791990 121 90 -9999 999
S00400891828 -9999 325
S00400891830 318 211 228 236 227 345 300 350 298 287
S00400891840 222 208 262 259 221 216 255 327 309 255
....
S00400891970 185 172 211 200 143 208 203 208 185 206
S00400891980 183 240 191 221 194 197 156 156 200 198
S00400891990 219 211 -9999 999
S00400991772 -9999 -9999 -9999 -9999 162 103 133 131
S00400991780 101 141 237 233 207 132 121 222 198 206
S00400991790 226 271 387 384 273 309 273 224 269 313
S00400991800 230 246 290 233 274 199 145 162 180 213
Example created by Convert5 from files available in .cat format in the ITRDB, christensen_denmark_oak/sverige/. (When CDendro opens such Catras files, all -9999 data is automatically removed.)
- Tab- or Space-characters as field delimiters
There are Tucson-like files with a tab character as the delimiter between fields.
One or more space characters also occur as field delimiters, especially when a Tucson file has been read in from a printed document or a .pdf file.
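For files delimited this way, a reader can split on any run of whitespace instead of relying on fixed columns. The sketch below uses Python's `str.split()` without arguments, which treats tabs, spaces and carriage returns alike; it therefore also discards trailing carriage returns of the kind seen in the brit045.rwl snippet. The function name `split_fields` is hypothetical, and the approach assumes the identity contains no spaces and is not fused to the year.

```python
def split_fields(line):
    """Split a delimiter-separated Tucson-like line into tokens.

    str.split() with no argument splits on any run of whitespace
    (spaces, tabs, \r, \n) and ignores leading and trailing whitespace.
    """
    return line.split()
```

An identity such as "606 13" from fran009.rwl would be split into two tokens by this function, so fixed-column slicing is still needed for files of that kind.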
CDendro interpretation
Using comments
Descriptions of dendrochronology programs such as Cofecha and Arstan often note that lines that cannot be interpreted as ring width data lines are treated as comments. CDendro uses that feature. Comments may then be bound both to the .rwl file (the sample collection) itself and to individual members (samples) of the collection.
SN 1 Saltsjobaden PISY
SN 2 SWEDEN Scots pine 5917N1818E 1696 2005
SN 3 Lars-Ake Larsson
SN #### Samples taken from living or fallen trees except the group SNKBxx, which are poles of an old pier
SN #### (Kolbryggan) in the bay Palnasviken, which have been standing in clay for a 110 years.
SN #### The sample SNSU01 is taken from the Skutudden cottage.
SN001A 1923 341 374 369 298 500 382 396
SN001A 1930 332 297 421 250 290 288 317 320 256 215
SN001A 1940 111 183 229 183 159 157 163 134 105 111
SN001A 1950 81 62 89 138 164 180 138 157 170 130
SN001A 1960 108 184 137 148 124 164 80 98 105 67
SN001A 1970 57 89 126 101 114 100 83 76 71 57
SN001A 1980 114 87 79 85 56 49 61 79 81 70
SN001A 1990 99 139 132 161 132 118 999
SN001A #### You may store comments for a sample too
SN001A #### and it may extend over several lines.
SN001B 1923 329 375 319 299 435 366 384
SN001B 1930 287 258 392 280 251 293 296 278 200 181
SN001B 1940 114 154 224 160 138 106 127 126 97 96
SN001B 1950 68 65 83 114 168 176 91 163 189 146
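The rule "anything that is not a data line is a comment" can be approximated with a simple test. This is a heuristic sketch under the assumption of whitespace-separated fields; it is not the test actually used by Cofecha, Arstan or CDendro, and the name `is_data_line` is hypothetical.

```python
def is_data_line(line):
    """Return True if the line looks like a ring width data line.

    A line qualifies when it has an identity, a year and at least one
    value, and everything after the identity parses as an integer.
    Header lines and "####" comment lines fail this test.
    """
    fields = line.split()
    if len(fields) < 3:
        return False
    try:
        for token in fields[1:]:
            int(token)
    except ValueError:
        return False
    return True
```

Note that the three meta data lines at the top of a file also fail the test, so a reader has to handle them separately from member comments.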
Example of .rwl file with comments created with CDendro
The CDendro .rwc format for specifying the proportion of Latewood
CooRecorder has provisions for measuring latewood and earlywood. As a .rwl file cannot contain both latewood and earlywood, CDendro has a .rwc format which is almost the same as the .rwl format, except that the latewood is specified as a sequence of per mille values at the end of the normal ring width lines:
SN 1 Saltsjobaden PISY
SN 2 SWEDEN Scots pine 5917N1818E 1733 2003
SN 3 Lars-Ake Larsson
SN001A 1923 341 374 369 298 500 382 396 # 79 131 63 179 143 288 139
SN001A 1930 332 297 421 250 290 288 317 320 256 215 # 245 190 300 76 201 267 288 326 281 230
SN001A 1940 111 183 229 183 159 157 163 134 105 111 # 101 160 210 214 151 243 173 49 123 139
SN001A 1950 81 62 89 138 164 180 138 157 170 130 # 55 47 100 189 283 25 167 293 350 26
SN001A 1960 108 184 137 148 124 164 80 98 105 67 # 392 377 195 209 99 280 108 187 195 108
SN001A 1970 57 89 126 101 114 100 83 76 71 57 # 247 315 209 105 331 152 98 248 263 301
SN001A 1980 114 87 79 85 56 49 61 79 81 70 # 412 199 253 77 203 190 146 215 391 206
SN001A 1990 99 139 132 161 132 118 999 # 518 345 278 369 229 163
SN001B 1923 329 375 319 299 435 366 384 # 99 122 73 135 146 223 139
SN001B 1930 287 258 392 280 251 293 296 278 200 181 # 239 126 150 126 232 178 183 326 363 228
The ring width of 1923 is 3.41 mm, of which the latewood is 0.079*3.41=0.27 mm.
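The latewood arithmetic can be sketched for a whole .rwc line. This is an illustration based only on the example above, not a specification of the format: `parse_rwc_line` is a hypothetical name, and the sketch assumes 0.01 mm units, whitespace-separated fields, and a 999 stop marker that has no matching latewood value.

```python
def parse_rwc_line(line):
    """Split one .rwc data line into per-ring widths and latewood.

    Returns (core_id, rings), where rings is a list of
    (year, ring_width_mm, latewood_mm) tuples.
    """
    data_part, _, latewood_part = line.partition("#")
    fields = data_part.split()
    core_id, first_year = fields[0], int(fields[1])
    # Drop the stop marker; it is not a measurement.
    widths = [int(t) for t in fields[2:] if int(t) != 999]
    permille = [int(t) for t in latewood_part.split()]

    rings = []
    for offset, (width, share) in enumerate(zip(widths, permille)):
        width_mm = width * 0.01
        rings.append((first_year + offset, width_mm, width_mm * share / 1000))
    return core_id, rings
```

For the first SN001A line this gives a ring of 3.41 mm for 1923 with about 0.27 mm of latewood, matching the calculation above.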
Meta data
Meta data that is collected from e.g. a Heidelberg file can be written by CDendro to a .rwl file:
...
1AD0046A1500 54 64 64 67 64 54 61 71 78 81
1AD0046A1510 84 94 105 84 108 108 115 88 135 108
1AD0046A1520 111 98 121 94 138 118 111 94 108 105
1AD0046A1530 999
1AD0046A#### Location=Stockholm; Species=PISY; TreeNo=0;
1AD0046A#### CoreNo=0; Project=123;
Naming standard
Limitations of the Tucson format
- The amount of meta data is limited to what is specified for the first three lines of a .rwl file. The syntax of that meta data is also only loosely specified, i.e. in practice it is almost free text.
- If meta data is stored as comments with a special syntax, there is no common specification for how that meta data should be named and written.
- There is no specification of how latewood and earlywood should be stored within the same .rwl file. Within the ITRDB, latewood and earlywood data are saved in separate .rwl files.
- There is no specification of e.g. a naming standard for keeping radii from the same stem together, though see also the CDendro naming standard.
The Tucson chronology .crn format
Citation from http://www.ncdc.noaa.gov/paleo/treeinfo.html
Processed Data Files (Site Chronologies, File Extension .CRN)
These are the standardized tree-growth indices from a stand of trees, representing the mean growth observed for each year over the entire stand. Site chronologies are used in climate analysis. Data are stored as 3 or 4-digit numbers, with a value of 1000 representing mean growth, a minimum value of 0 (no growth), and no defined maximum. There is only one time series per file, in contrast to the raw data files. Missing value code is 9990. Site information is stored in the first 3 records of the file.
Format for chronology header records:
Record #1: 1-6 Site ID, 10-61 Site Name, 62-65 Species Code, optional ID#'s
Record #2: 1-6 Site ID, 10-22 State/Country, 23-30 Species, 41-45 Elevation, 48-57 Lat-Long, 68-76 1st & last Year
Note: lat-lons are in degrees and minutes, ddmm or dddmm
Record #3: 1-6 Site ID, 10-72 Lead Investigator, 73-80 comp. date
Chronology Data, Records 4-??
Site ID# column 1-6
Decade column 7-10
Index Value-Sample Number* pairs of values, columns 11-80, 10(I4+I3)
TRL ID#(optional) column 82-88
*Index Values, columns 11-14,18-21,25-28,32-35,etc
# of samples used in calculating chronology, columns 15-17,22-24,29-31,36-38,etc.
Example:1450 670 171018 17 897 18...
Here, 670 is the ring-width index value for the year 1450, with a sample size of 17; 1018 is the ring-width index value for the year 1451, with a sample size of 17; 897 is the ring-width index value for the year 1452, with a sample size of 18
Chronology Statistics, Last Record, Optional:
Site ID# column 1-6
Number of Years column 8-10
First Order Autocorrelation column 13-16
Standard Deviation column 19-22
Mean Sensitivity column 25-28
Mean Index Value column 29-35
Sum of Indices column 37-44
Sum of Squares of Indices column 46-53
Max# of series column 62-63
Example:
ANEBY 1 Anebymossen PISY
ANEBY 2 Sweden Scots Pine 295M 5751 01438 __ 1846 1996
ANEBY 3 Hans Linderholm
ANEBYS18469990 09990 09990 09990 09990 09990 0 798 11495 1 779 1 976 1
ANEBYS1850 869 1 984 11116 1 881 2 936 2 843 2 394 2 613 21538 21406 2
ANEBYS18601466 2 967 21468 3 735 3 701 31041 3 776 31116 31271 3 804 3
ANEBYS18701270 3 821 31034 4 772 4 818 41011 41067 41052 41073 61195 6
ANEBYS18801523 71086 8 926 8 771 9 986 9 950 9 992 9 807 9 908 91041 9
ANEBYS18901169 9 920 91274 91259 91074 10 974 101198 101208 111371 121022 12
ANEBYS1900 774 121071 12 755 12 851 12 721 12 730 12 814 12 758 12 642 12 727 12
ANEBYS19101101 12 987 13 830 14 731 141001 141022 14 991 14 701 14 992 141047 14
ANEBYS1920 973 141158 141149 141212 14 960 16 978 17 992 17 957 17 660 17 498 17
ANEBYS1930 539 17 679 17 774 17 903 171183 171261 171490 171603 181441 181407 19
ANEBYS19401087 191163 191048 20 981 20 900 201398 201485 201044 21 900 211003 21
ANEBYS1950 974 21 962 21 836 21 975 21 881 21 891 21 822 21 967 211093 21 905 21
ANEBYS1960 702 21 843 21 803 21 829 21 901 21 900 21 908 211256 211256 211122 21
ANEBYS1970 846 21 852 211198 211274 211026 211320 211507 211178 211384 211313 21
ANEBYS19801232 211075 21 861 21 977 21 984 21 945 21 956 21 922 21 779 21 819 21
ANEBYS1990 703 21 889 21 774 21 892 21 939 211097 21 820 219990 09990 09990 0
Example from swed314.crn[9] showing the structure of a chronology file.
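Because index values and sample counts run together without separators, a .crn data record has to be read by column position and not by splitting on whitespace. The sketch below follows the column layout quoted above (site ID in columns 1-6, decade in columns 7-10, then up to ten I4+I3 pairs). It assumes the original fixed-width padding of the line is intact, which may not be true of text copied from a web page; `parse_crn_line` is a hypothetical name.

```python
def parse_crn_line(line):
    """Read one chronology data record by fixed columns.

    Returns (site_id, decade, pairs), where pairs is a list of
    (index_value, sample_count) in positional order. The missing
    value code 9990 is returned as None.
    """
    site_id = line[0:6].strip()
    decade = int(line[6:10])
    pairs = []
    for i in range(10):
        start = 10 + 7 * i             # each pair is 4 + 3 characters wide
        chunk = line[start:start + 7]
        if len(chunk) < 7:
            break                      # short or truncated record
        index_value = int(chunk[:4])
        sample_count = int(chunk[4:])
        pairs.append((None if index_value == 9990 else index_value, sample_count))
    return site_id, decade, pairs
```

The pairs are returned by position only. In the first record of the swed314.crn example the leading 9990 entries appear to pad the years before 1846, so mapping positions to calendar years needs extra care for the first and last records.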
CDendro interpretation
CDendro allows extra comments to be stored in the header not only of ordinary .rwl files but also of .crn files. Example:
NORRW 1 Norrland wood PISY
NORRW 2 Sweden Scots pine 6300N1800E 1513 1874
NORRW 3 Lars-Ake Larsson
#Based on
#A. samples from boards (outside wooden panel) of the Sandviken house, Vastanvik, Namdo.
#B. a sample from a board in an old barn at Langholmen, Runmaro, an island east of Stockholm.
#C. a house built in 1907 in Saltsjobaden (Palnasv 1)
#The wood obviously seems to be imported from long way north of the Stockholm area.
#CDENDRO_DC This chronology is intended for crossdating purpose only! Standardization: NegExpDetrend SumByStem=Yes
NorTrd15139990 09990 09990 01544 11752 11690 11225 11713 11345 11239 1
NorTrd1520 823 11235 11489 11208 1 834 11265 11050 11416 11923 12001 1
NorTrd15301797 12041 12305 12632 13319 13204 12575 13262 12438 12525 1
NorTrd15401932 12727 12650 11674 21748 21832 21842 21856 21438 21350 2
NorTrd15501305 21713 21395 21572 21233 21572 21531 21439 21546 31482 3
......
See also
See also Dendro data format