This shows you the differences between two versions of the page.
data:data_analysis_manual:read_catalog_python [2021/11/12 11:14] eric buchlin How to use the catalogue |
data:data_analysis_manual:read_catalog_python [2022/08/27 09:54] eric buchlin Mention CSV catalog |
||
---|---|---|---|
Line 1: | Line 1: | ||
====== How to read the UiO FITS files catalog in Python ====== | ====== How to read the UiO FITS files catalog in Python ====== | ||
+ | |||
+ | The following script reads the text-format catalog. UiO now also provides a CSV catalog. | ||
<file python read_uio_cat.py> | <file python read_uio_cat.py> | ||
Line 7: | Line 9: | ||
# SPICE data tree path, to be changed to your SPICE data mirror | # SPICE data tree path, to be changed to your SPICE data mirror | ||
data_path = "/archive/SOLAR-ORBITER/SPICE" # example for IAS computing servers | data_path = "/archive/SOLAR-ORBITER/SPICE" # example for IAS computing servers | ||
+ | |||
+ | |||
+ | def date_parser(string): | ||
+ | try: | ||
+ | return pd.Timestamp(string) | ||
+ | except ValueError: | ||
+ | return pd.NaT | ||
+ | |||
def read_uio_cat(): | def read_uio_cat(): | ||
Line 24: | Line 34: | ||
""" | """ | ||
cat_file = Path(data_path) / "fits" / "spice_catalog.txt" | cat_file = Path(data_path) / "fits" / "spice_catalog.txt" | ||
+ | if not cat_file.exists(): | ||
+ | print(f'Error: Catalog file not available at {cat_file.as_posix()}') | ||
+ | sys.exit(1) | ||
columns = list(pd.read_csv(cat_file, nrows=0).keys()) | columns = list(pd.read_csv(cat_file, nrows=0).keys()) | ||
date_columns = ['DATE-BEG','DATE', 'TIMAQUTC'] | date_columns = ['DATE-BEG','DATE', 'TIMAQUTC'] | ||
- | df = pd.read_table(cat_file, skiprows=1, names=columns, na_values="MISSING", | + | df = pd.read_table(cat_file, skiprows=1, names=columns, |
- | parse_dates=date_columns, warn_bad_lines=True) | + | parse_dates=date_columns, date_parser=date_parser, |
- | df.LEVEL = df.LEVEL.apply(lambda string: string.strip()) | + | low_memory=False) |
- | df.STUDYTYP = df.STUDYTYP.apply(lambda string: string.strip()) | + | |
return df | return df | ||
</file> | </file> | ||
- | ''na_values="MISSING"'' replaces the string "MISSING" with NaN; this argument can be removed. | ||
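For readers on pandas 2.0 or later, where the `date_parser` argument of `pd.read_table` is deprecated, the same tolerant date handling can be sketched by converting the date columns after reading. The sample table below is hypothetical and does not reproduce the real catalog layout; only the column names `DATE-BEG` and `LEVEL` come from the script above. This is a minimal sketch under those assumptions, not the method used by the page's author.

```python
import io

import pandas as pd


def date_parser(string):
    """Same fallback as in the script: unparseable strings become NaT."""
    try:
        return pd.Timestamp(string)
    except ValueError:
        return pd.NaT


# Hypothetical tab-separated sample (assumption: NOT the real catalog layout)
sample = "DATE-BEG\tLEVEL\n2021-11-12T11:14:00\tL2\nMISSING\tL1\n"

# Read everything as plain columns first, without any date handling
df = pd.read_table(io.StringIO(sample))

# errors="coerce" turns values that cannot be parsed as dates into NaT,
# which matches what date_parser does one string at a time
df["DATE-BEG"] = pd.to_datetime(df["DATE-BEG"], errors="coerce")

print(df)
```

On the real catalog, the same `pd.to_datetime(..., errors="coerce")` call would presumably be applied to each entry of `date_columns` after `pd.read_table`, but this has not been checked against the actual file.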
Then we can read the catalog and filter it: | Then we can read the catalog and filter it: |