How to read and use the UiO FITS files catalog in Python

read_uio_cat_csv.py
import sys
from pathlib import Path
import pandas as pd
 
# SPICE data tree path, to be changed to your SPICE data mirror
data_path = "/archive/SOLAR-ORBITER/SPICE"      # example for IAS computing servers
 
 
def date_parser(string):
    # Convert one date string to a Timestamp; unparseable values become NaT
    try:
        return pd.Timestamp(string)
    except ValueError:
        return pd.NaT
 
 
def read_uio_cat():
    """
    Read UiO SPICE FITS files CSV catalog
    http://astro-sdc-db.uio.no/vol/spice/fits/spice_catalog.csv
 
    Return
    ------
    pandas.DataFrame
        Table
    """
    cat_file = Path(data_path) / "fits" / "spice_catalog.csv"
    if not cat_file.exists():
        print(f'Error: Catalog file not available at {cat_file.as_posix()}')
        sys.exit(1)
    date_columns = ['DATE-BEG', 'DATE', 'TIMAQUTC']
    # Note: the date_parser argument is deprecated since pandas 2.0
    df = pd.read_csv(cat_file, parse_dates=date_columns, date_parser=date_parser)
    return df

The same approach works for the catalog included in the data releases (here: release 2.0), which can be read directly from its URL:

read_release_cat.py
import pandas as pd
 
def date_parser(string):
    try:
        return pd.Timestamp(string)
    except ValueError:
        return pd.NaT
 
date_columns = ['DATE-BEG','DATE', 'TIMAQUTC']
cat = pd.read_csv(
    'https://spice.osups.universite-paris-saclay.fr/spice-data/release-2.0/catalog.csv',
    date_parser=date_parser,
    parse_dates=date_columns
)
# TODO interpret the JSON included in columns `proc_steps` and `windows`.
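
The TODO above could be approached along the following lines. This is only a minimal sketch: it assumes that each cell of `proc_steps` and `windows` holds a JSON-encoded string, and the small table and the helper `parse_json_cell` are hypothetical stand-ins, not the actual content of the release catalog.

```python
import json

import pandas as pd


def parse_json_cell(value):
    """Decode one JSON cell; return None for missing or malformed entries."""
    if not isinstance(value, str):
        return None  # missing value (NaN/None)
    try:
        return json.loads(value)
    except json.JSONDecodeError:
        return None


# Hypothetical stand-in for the release catalog; real cell contents may differ.
cat = pd.DataFrame({
    "proc_steps": ['[{"step": "example"}]', None],
    "windows": ['["window A", "window B"]', "not json"],
})

# Replace the JSON strings by the decoded Python objects (lists, dicts)
for column in ("proc_steps", "windows"):
    cat[column] = cat[column].apply(parse_json_cell)
```

With the real catalog, the hypothetical table would be replaced by the `cat` returned by `pd.read_csv` above.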

Text catalog

read_uio_cat_txt.py
import sys
from pathlib import Path
import pandas as pd
 
# SPICE data tree path, to be changed to your SPICE data mirror
data_path = "/archive/SOLAR-ORBITER/SPICE"      # example for IAS computing servers
 
 
def date_parser(string):
    try:
        return pd.Timestamp(string)
    except ValueError:
        return pd.NaT
 
 
def read_uio_cat():
    """
    Read UiO text table SPICE FITS files catalog
    http://astro-sdc-db.uio.no/vol/spice/fits/spice_catalog.txt
 
    Return
    ------
    pandas.DataFrame
        Table
    """
    cat_file = Path(data_path) / "fits" / "spice_catalog.txt"
    if not cat_file.exists():
        print(f'Error: Catalog file not available at {cat_file.as_posix()}')
        sys.exit(1)
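    # Read only the header line to get the column names
    columns = list(pd.read_csv(cat_file, nrows=0).keys())
    date_columns = ['DATE-BEG', 'DATE', 'TIMAQUTC']
    # Read the table itself, skipping the header line read above
    df = pd.read_table(cat_file, skiprows=1, names=columns,
                       parse_dates=date_columns, date_parser=date_parser,
                       low_memory=False)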
    return df
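
The readers above rely on the `date_parser` argument, which is deprecated since pandas 2.0. A possible alternative, sketched here, is to read the date columns as plain strings and convert them afterwards with `pd.to_datetime(..., errors="coerce")`, which also turns unparseable values into `NaT`. The two-row CSV excerpt is hypothetical, and the result is not guaranteed to be identical to the per-value `pd.Timestamp` conversion for every input.

```python
import io

import pandas as pd

# Hypothetical two-row excerpt; the real catalog has many more columns.
csv_text = "DATE-BEG,LEVEL\n2021-11-06T12:00:00,L2\nnot-a-date,L1\n"

# Read the date column as plain strings first
df = pd.read_csv(io.StringIO(csv_text))

date_columns = ["DATE-BEG"]
for column in date_columns:
    # errors="coerce" replaces unparseable strings by NaT instead of raising
    df[column] = pd.to_datetime(df[column], errors="coerce")
```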

Using the catalog

Once one of the read_uio_cat() functions above is defined, we can read the catalog and filter it:

filter_cat.py
cat = read_uio_cat()
filtered_cat = cat[(cat['DATE-BEG'] > '2021-11-05') & (cat.LEVEL == 'L2')]

cat then contains the full catalog (as a pandas DataFrame) and filtered_cat contains only the rows that pass the filter, here those with an observation start date after 2021-11-05 and file level L2.
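
To see how this filter works without access to the data tree, the sketch below applies the same conditions to a small, hypothetical three-row table. Each condition yields a boolean Series, and `&` combines them row by row; the variable names `after_date` and `is_level_2` are only illustrative.

```python
import pandas as pd

# Hypothetical miniature catalog with only the two columns used by the filter
cat = pd.DataFrame({
    "DATE-BEG": pd.to_datetime(["2021-10-01", "2021-11-06", "2021-11-07"]),
    "LEVEL": ["L2", "L1", "L2"],
})

after_date = cat["DATE-BEG"] > "2021-11-05"   # boolean mask on the start date
is_level_2 = cat["LEVEL"] == "L2"             # boolean mask on the file level

# Keep only the rows for which both conditions are true
filtered_cat = cat[after_date & is_level_2]
```

The bracket notation `cat["LEVEL"]` is equivalent to `cat.LEVEL`, but it is required for column names such as `DATE-BEG` that are not valid Python identifiers.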

The catalog or filtered catalog can be exported, e.g. to a CSV file with cat.to_csv(path); called without a path, to_csv() returns the CSV text as a string.
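
As a minimal sketch of the export, the example below writes a small, hypothetical table to a temporary file and reads it back. The file name `filtered_catalog.csv` is an arbitrary choice, and `index=False` is used here so that the DataFrame index is not written as an extra column.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical miniature catalog standing in for cat or filtered_cat
cat = pd.DataFrame({
    "DATE-BEG": pd.to_datetime(["2021-11-06", "2021-11-07"]),
    "LEVEL": ["L1", "L2"],
})

# Without a path, to_csv() returns the CSV content as a string
csv_text = cat.to_csv(index=False)

# With a path, the table is written to disk
out_file = Path(tempfile.mkdtemp()) / "filtered_catalog.csv"
cat.to_csv(out_file, index=False)

# The exported file can be read back like the original catalog
reloaded = pd.read_csv(out_file, parse_dates=["DATE-BEG"])
```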

data/data_analysis_manual/read_catalog_python.txt · Last modified: 2022/09/21 13:57 by gabriel pelouze