data:data_analysis_manual:read_catalog_python

Differences

This shows you the differences between two versions of the page.

data:data_analysis_manual:read_catalog_python [2021/06/30 10:54]
eric buchlin moved
data:data_analysis_manual:read_catalog_python [2022/08/27 10:01]
eric buchlin Add function to read the CSV catalog
Line 1: Line 1:
-====== How to read the UiO FITS files catalog in Python ======
 +====== How to read and use the UiO FITS files catalog in Python ======
 +
 +===== CSV catalog (new, recommended) =====
  
 <file python read_uio_cat.py>
Line 7: Line 9:
 # SPICE data tree path, to be changed to your SPICE data mirror
 data_path = "/archive/SOLAR-ORBITER/SPICE"      # example for IAS computing servers
 +
 +
 +def date_parser(string):
 +    try:
 +        return pd.Timestamp(string)
 +    except ValueError:
 +        return pd.NaT
 +
  
 def read_uio_cat():
     """
-    Read UiO text table SPICE FITS files catalog
-    http://astro-sdc-db.uio.no/vol/spice/fits/spice_catalog.txt
 +    Read UiO SPICE FITS files CSV catalog
 +    http://astro-sdc-db.uio.no/vol/spice/fits/spice_catalog.csv
  
     Return
Line 17: Line 27:
     pandas.DataFrame
         Table
 +    """
 +    cat_file = Path(data_path) / "fits" / "spice_catalog.csv"
 +    if not cat_file.exists():
 +        print(f'Error: Catalog file not available at {cat_file.as_posix()}')
 +        sys.exit(1)
 +    date_columns = ['DATE-BEG','DATE', 'TIMAQUTC']
 +    df = pd.read_csv(cat_file, parse_dates=date_columns, date_parser=date_parser)
 +    return df
 +</file>
  
-    Example queries that can be done on the result: 
  
-    * `df[(df.LEVEL == "L2") & (df["DATE-BEG"] >= "2020-11-17") & (df["DATE-BEG"] < "2020-11-18") & (df.XPOSURE > 60.)]`
-    * `df[(df.LEVEL == "L2") & (df.STUDYDES == "Standard dark for cruise phase")]`
 +===== Text catalog =====
 +
 +<file python read_uio_cat.py> 
 +import sys
 +from pathlib import Path
 +import pandas as pd
 + 
 +# SPICE data tree path, to be changed to your SPICE data mirror 
 +data_path = "/archive/SOLAR-ORBITER/SPICE"      # example for IAS computing servers
 + 
 + 
 +def date_parser(string):
 +    try:
 +        return pd.Timestamp(string)
 +    except ValueError:
 +        return pd.NaT
 + 
 + 
 +def read_uio_cat():
 +    """
 +    Read UiO text table SPICE FITS files catalog
 +    http://astro-sdc-db.uio.no/vol/spice/fits/spice_catalog.txt
 +
 +    Return
 +    ------
 +    pandas.DataFrame
 +        Table
     """
     cat_file = Path(data_path) / "fits" / "spice_catalog.txt"
 +    if not cat_file.exists():
 +        print(f'Error: Catalog file not available at {cat_file.as_posix()}')
 +        sys.exit(1)
     columns = list(pd.read_csv(cat_file, nrows=0).keys())
     date_columns = ['DATE-BEG','DATE', 'TIMAQUTC']
-    df = pd.read_table(cat_file, skiprows=1, names=columns, na_values="MISSING",
-                    parse_dates=date_columns, warn_bad_lines=True)
-    df.LEVEL = df.LEVEL.apply(lambda string: string.strip())
-    df.STUDYTYP = df.STUDYTYP.apply(lambda string: string.strip())
 +    df = pd.read_table(cat_file, skiprows=1, names=columns,
 +                    parse_dates=date_columns, date_parser=date_parser,
 +                    low_memory=False)
     return df
 </file>
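The ''date_parser'' helper above returns ''pd.NaT'' for date strings that cannot be parsed. As a minimal sketch of an alternative, assuming a recent pandas version (2.0 or later) in which the ''date_parser'' argument of the reader functions is deprecated, the same tolerance can be obtained by reading the date columns as plain strings and converting them afterwards with ''pd.to_datetime(..., errors="coerce")''. The two-row table below is made up for illustration; only the column names are taken from the catalog.

```python
import io

import pandas as pd

# Hypothetical two-row catalog extract; the real catalog has many more columns.
sample = io.StringIO(
    "LEVEL,DATE-BEG,XPOSURE\n"
    "L2,2020-11-17T12:00:00,120.0\n"
    "L2,MISSING,60.0\n"
)

# Read the date column as plain strings first
df = pd.read_csv(sample)

# errors="coerce" turns unparseable strings into NaT instead of raising
df["DATE-BEG"] = pd.to_datetime(df["DATE-BEG"], errors="coerce")
```

This keeps the reading step independent of the date-parsing keyword arguments, which have changed between pandas versions.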
  
-''na_values="MISSING"'' replaces the string "MISSING" with NaNs; it can be removed.
  
 +===== Using the catalog =====
 +
 +Then we can read the catalog and filter it:
 +
 +<file python filter_cat.py>
 +cat = read_uio_cat()
 +filtered_cat = cat[(cat['DATE-BEG'] > '2021-11-05') & (cat.LEVEL == 'L2')]
 +</file>
 +
 +''cat'' then contains the full catalog (as a ''pandas'' DataFrame) and ''filtered_cat'' contains only the rows selected, in this particular case by observation date and file level.
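Several conditions can be combined in one selection with ''&'', each wrapped in its own parentheses. The sketch below illustrates this on a small made-up DataFrame that stands in for the result of ''read_uio_cat()''; the column names ''LEVEL'', ''DATE-BEG'' and ''XPOSURE'' are those of the catalog, while the values are invented for the example.

```python
import pandas as pd

# Hypothetical stand-in for the DataFrame returned by read_uio_cat()
cat = pd.DataFrame({
    "LEVEL": ["L1", "L2", "L2", "L2"],
    "DATE-BEG": pd.to_datetime([
        "2020-11-17T08:00:00",
        "2020-11-17T10:00:00",
        "2020-11-17T23:00:00",
        "2020-11-18T01:00:00",
    ]),
    "XPOSURE": [120.0, 120.0, 30.0, 120.0],
})

# L2 files starting on 2020-11-17 with XPOSURE above 60
selection = cat[
    (cat.LEVEL == "L2")
    & (cat["DATE-BEG"] >= pd.Timestamp("2020-11-17"))
    & (cat["DATE-BEG"] < pd.Timestamp("2020-11-18"))
    & (cat.XPOSURE > 60.0)
]
```

Column names containing a hyphen, such as ''DATE-BEG'', have to be accessed with the bracket syntax, because ''cat.DATE-BEG'' would be parsed as a subtraction.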
 +
 +The catalog or the filtered catalog can be exported, e.g. to a CSV table with ''cat.to_csv()''.
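As a brief sketch of the export step: called without arguments, ''to_csv()'' returns the CSV text as a string, whereas passing a path or a file-like object writes the table there instead. The one-row DataFrame and the in-memory buffer below are placeholders for illustration; in practice a file path on disk would be passed.

```python
import io

import pandas as pd

# Hypothetical one-row stand-in for the (filtered) catalog
cat = pd.DataFrame({"LEVEL": ["L2"], "XPOSURE": [120.0]})

# With no target, to_csv() returns the CSV text as a string
csv_text = cat.to_csv(index=False)

# With a path or file-like object, it writes there and returns None
buffer = io.StringIO()
cat.to_csv(buffer, index=False)
```

''index=False'' leaves out the DataFrame row index, which is usually not needed in an exported catalog.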
data/data_analysis_manual/read_catalog_python.txt · Last modified: 2024/03/29 14:11 by eric buchlin