CLEDB_PREPINV - Pre-processing

Purpose:

The CLEDB_PREVINV module processes both the input data and the CLEDB_BUILD generated databases to prepare for the main inversion processing.

For 1-line cases, only the observation is pre-processed. Observation keywords are ingested, the geometric height is computed and the spectroscopic IQUV profiles are integrated.

In addition for 2-line observations, the observation linear polarization is rotated to match the database calculation as described in Paraschiv & Judge, SolPhys, 2022. The module then performs a height match between the input data and database configuration. Only the optimal subset of database height entries are pre-loaded into memory to minimize I/O operations but also to avoid I/O bottlenecks when running the analysis routines of the CLEDB_PROC module.

CLEDB_PREPINV Module Functions

Note

The \(\diamond\), \(\triangleright\), and \(\triangleright\triangleright\) symbols respectively denote main, secondary, and tertiary (helper) level functions. Main functions are called by the example scripts. Secondary functions are called by the main functions, and tertiary from either main or secondary functions.

\(\diamond\) SOBS_PREPROCESS

Main function to process an input observation and ingest the relevant header keywords. Generally, this function iterates over the observation maps and sends each pixel to the internal functions. It returns a processed observation array (input dependent) that is ready for analysis. Additional products are calculated. e.g. a height map (used to match databases), signal statistics, etc. via its subfunctions.

\(\triangleright\) OBS_CALCHEIGHT

Calculates height map of the same xy dimensions as the input array. Each pixel encodes the solar height in units of R\(_\odot\).

\(\triangleright\) OBS_INTEGRATE

Estimates background counts using a cumulative distribution function (CDF) statistical method, then integrates along the wavelength dimension, in all IQUV components of all input lines. Profile integration is required because the database dimensionality and inversion computational times would not be feasible when processing full-spectra observations. See Paraschiv & Judge, SolPhys, 2022 for additional information.

\(\triangleright\triangleright\) OBS_CDF

Computes the CDF distribution from spectra corresponding to one voxel.

\(\triangleright\) OBS_QUROTATE

If ingested a 2-line observation, the Stokes Q and U components are rotated to match the database’s reference direction with the pbservation reference direction. The observation reference direction should be read as input from the header metadata. This enables using just a 1D database computation (along y heights) to match the any observed Stokes profiles, under any linear polarization reference in a 2D map with these applied rotational transforms. See Paraschiv & Judge, SolPhys, 2022 for extended information.

\(\diamond\) SDB_PREPROCESS

Main function for selecting and reading into memory the optimal database calculations that are compatible with the observations processed via SOBS_PREPROCESS.

\(\triangleright\) SDB_FILEINGEST

Glob the database directory to ingest available database heights and process the database configuration using the header metadata.

\(\triangleright\) SDB_FINDELONGATION

Compares the database entries (along ny) with the heights covered by the observation to deduce the closest matching database entries and minimizes the number of database DBXXXX.dat files to be read into memory.

\(\triangleright\) SDB_PARSEHEADER

Parses the database header information from the db.hdr file.

\(\triangleright\triangleright\) SDB_LCGRID

Computes the grid spacing for the logarithm of density ranges covered in the database. The grid is correspondent to the density configuration in DB.INPUT of the database calculations.

\(\triangleright\) SDB_READ

Reads all needed database files. This concludes all the disk I/O operations done during one run of the inversion.

Warning

As databases are written as binary files, the variable type fed to the np.fromfile reader needs to match the Fortran datatype CLE dbe.f uses to write the calculations. Currently these are set as single precision floats of np.float32 and REAL types respectively.

The CLEDB_PREPINV module is not fully compatible with Numba non-python mode, due to disk I/O operations. All non-python compatible functions are enabled in non-python mode while the rest are compiled in object-mode via the hard-coded “forcedobj=True” flag in the @jit decorators. The Python Modules section provides more details on the differences between the two Numba modes. The algorithm flow is described in the below diagram.

_images/3_CLEDB_PREP.png

CLEDB_PREPINV Main Variables

sobs_tot [xs,ys,nline*4] float array

Contains the background subtracted, integrated, and normalized Stokes IQUV spectra for 1-line ([xs,ys,4]) or 2-line ([xs,ys,8]) observations.

sobs_totrot [xs,ys,nline*4] float array

Derived from sobs_tot. The Stokes Q and U components are rotated along the center of the Sun to match the reference direction for linear polarization (the reference in which the database is created by CLEDB_BUILD). In inner functions of CLEDB_PROC only one pixel is passed at a time as sobs_1pix. The variable is initialized as a “zero” array that is returned in the case of 1-line observations to keep a standardized function input/output needed for Numba vectorization.

background [xs,ys,nline*4] float array

Returns averaged background counts for each observed voxel and each Stokes component.

rms [xs,ys,nline*4] float array

Returns the root mean square (RMS) of the total counts in each Stokes profile. The rms calculation is correspondent to the ratio between intensity in the line core and background counts (the variance). This measurement shows the quality in the signal for a particular observed voxel.

yobs [xs,ys] float array

The header keyword input is used to construct a height projection for each observed voxel in units or R\(_\odot\). In inner functions of CLEDB_PROC only one pixel is passed at a time as yobs_1pix.

aobs [xs,ys] float array

Stores the linear polarization angle transformation performed by the OBS_QUROTATE function. This information will be used to derotate the matched database profiles found by the CLEDB_INVPROC 2-line inversion function for comparison. In inner functions of CLEDB_PROC only one pixel is passed at a time as aobs_1pix. The variable is initialized and returned as a “zero” array in the case of 1-line observations due to Numba vectorization requirements.

dbsubdirs [string] or [string list]

Contains the directory structure formatted as described in the CLEDB_BUILD Output section.

database [ned,nx,nbphi,nbtheta,nline*4] list of float arrays

The list is the minimal subset of databases that are compatible with the observation taken from the set of ny entries of the database.

dbhdr [ints, floats and strings] list

Database header information containing the ranges and physical parameters used to generate the database.

db_enc [xs,ys] float array

Keeps an encoding of which of the memory loaded databases (elements in list of databases) to use for matching in each pixel.

issuemask [xs,ys] float array

An array that encodes issues appearing during processing. This array will be updated across all modules. The tentative issuemask implementation is described separately.

Note

Input variables, e.g. header *keys, sobs_in, ctrlparams, constants, etc. that are described in the Input Variables and Parameters section are not repeated in this section.