Differences

This shows you the differences between two versions of the page.

data:data_analysis_manual:read_catalog_python [2021/11/12 11:14]
eric buchlin How to use the catalogue
data:data_analysis_manual:read_catalog_python [2022/08/27 09:54]
eric buchlin Mention CSV catalog
Line 1: Line 1:
 ====== How to read the UiO FITS files catalog in Python ======
+
+The following script reads the text-format catalog. UiO now also provides a CSV catalog.
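As a rough illustration of reading a CSV catalog with pandas, the sketch below parses a small in-memory excerpt. The ''FILENAME'' column, the sample rows, and the use of "MISSING" as a placeholder in the CSV variant are assumptions made for illustration; the actual CSV catalog's location and layout are not described here.

```python
import io

import pandas as pd

# Hypothetical in-memory excerpt standing in for a CSV catalog file.
# FILENAME and the row values are made up; LEVEL and DATE-BEG are
# column names that appear in the text-catalog script.
csv_text = (
    "FILENAME,LEVEL,DATE-BEG\n"
    "file_a.fits,L2,2021-11-12T11:14:00\n"
    "file_b.fits,L2,MISSING\n"
)

# na_values turns the placeholder string into NaN while reading.
df = pd.read_csv(io.StringIO(csv_text), na_values="MISSING")

# Convert the date column explicitly; missing or unparseable values become NaT.
df["DATE-BEG"] = pd.to_datetime(df["DATE-BEG"], errors="coerce")

print(df.dtypes)
```

Converting dates after reading keeps the parsing step explicit and independent of reader-specific date options.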
  
 <file python read_uio_cat.py>
Line 7: Line 9:
 # SPICE data tree path, to be changed to your SPICE data mirror
 data_path = "/archive/SOLAR-ORBITER/SPICE"      # example for IAS computing servers
+
+
+def date_parser(string):
+    try:
+        return pd.Timestamp(string)
+    except ValueError:
+        return pd.NaT
+
 
 def read_uio_cat():
Line 24: Line 34:
     """
     cat_file = Path(data_path) / "fits" / "spice_catalog.txt"
+    if not cat_file.exists():
+        print(f'Error: Catalog file not available at {cat_file.as_posix()}')
+        sys.exit(1)
     columns = list(pd.read_csv(cat_file, nrows=0).keys())
     date_columns = ['DATE-BEG','DATE', 'TIMAQUTC']
-    df = pd.read_table(cat_file, skiprows=1, names=columns, na_values="MISSING",
-                    parse_dates=date_columns, warn_bad_lines=True)
-    df.LEVEL = df.LEVEL.apply(lambda string: string.strip())
-    df.STUDYTYP = df.STUDYTYP.apply(lambda string: string.strip())
+    df = pd.read_table(cat_file, skiprows=1, names=columns,
+                    parse_dates=date_columns, date_parser=date_parser,
+                    low_memory=False)
     return df
 </file>
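To show what the ''date_parser'' helper added in this revision does, the sketch below applies it to one valid and one invalid date string and compares the result with ''pd.to_datetime(..., errors="coerce")''. The sample strings are made up. The comparison is offered because the ''date_parser'' argument of the pandas readers is deprecated as of pandas 2.0, so a conversion after reading may be a more durable option; this is a suggestion, not part of the catalog script.

```python
import pandas as pd


def date_parser(string):
    """Return a Timestamp, or NaT when the string is not a valid date."""
    try:
        return pd.Timestamp(string)
    except ValueError:
        return pd.NaT


# Made-up sample values: one ISO 8601 date and one placeholder string.
raw = pd.Series(["2021-11-12T11:14:00", "MISSING"])

# Element-by-element parsing with the helper from the script.
per_value = raw.apply(date_parser)

# Vectorised alternative: unparseable strings are coerced to NaT.
vectorised = pd.to_datetime(raw, errors="coerce")

print(per_value)
print(vectorised)
```

Both approaches leave ''NaT'' where the catalog holds a non-date string, so later date comparisons simply exclude those rows.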
  
-''na_values="MISSING"'' replaces the string "MISSING" with NaNs; it can be removed.
 
 Then we can read the catalog and filter it:
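A minimal sketch of such filtering, assuming a DataFrame shaped like the one returned by ''read_uio_cat()''. The table here is a hypothetical stand-in with made-up values; only the column names ''LEVEL'' and ''DATE-BEG'' come from the script, and the trailing spaces in ''LEVEL'' are an assumption about padded text fields.

```python
import pandas as pd

# Hypothetical stand-in for the catalog DataFrame; all values are made up.
cat = pd.DataFrame(
    {
        "LEVEL": ["L1 ", "L2 ", "L2 "],
        "DATE-BEG": pd.to_datetime(
            ["2021-11-01T00:00:00", "2021-11-12T11:14:00", "MISSING"],
            errors="coerce",  # the non-date string becomes NaT
        ),
    }
)

# Remove padding so that string comparisons behave as expected.
cat["LEVEL"] = cat["LEVEL"].str.strip()

# Boolean masks: level-2 files that start on or after a given date.
is_l2 = cat["LEVEL"] == "L2"
is_recent = cat["DATE-BEG"] >= pd.Timestamp("2021-11-10")

# Rows with NaT in DATE-BEG compare as False and drop out of the selection.
selected = cat[is_l2 & is_recent]
print(selected)
```

Combining boolean masks with ''&'' keeps each criterion readable and easy to change independently.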
data/data_analysis_manual/read_catalog_python.txt · Last modified: 2024/03/29 14:11 by eric buchlin