Team-based matrix-free Gradient operator for the spherical shell, with the same fused-arithmetic optimisations applied to DivergenceKerngen / EpsilonDivDivKerngen.

#include "../../quadrature/quadrature.hpp"
#include "communication/shell/communication.hpp"
#include "communication/shell/communication_plan.hpp"
#include "dense/vec.hpp"
#include "fe/wedge/integrands.hpp"
#include "fe/wedge/kernel_helpers.hpp"
#include "grid/shell/spherical_shell.hpp"
#include "linalg/operator.hpp"
#include "linalg/trafo/local_basis_trafo_normal_tangential.hpp"
#include "linalg/vector.hpp"
#include "linalg/vector_q1.hpp"
#include "util/timer.hpp"

Classes
class	terra::fe::wedge::operators::shell::GradientKerngen< ScalarT >

Namespaces
namespace	terra

namespace	terra::fe

namespace	terra::fe::wedge
	Features for wedge elements.

namespace	terra::fe::wedge::operators

namespace	terra::fe::wedge::operators::shell

Detailed Description

Team-based matrix-free Gradient operator for the spherical shell, with the same fused-arithmetic optimisations applied to DivergenceKerngen / EpsilonDivDivKerngen.

The structure mirrors DivergenceKerngen, but Gradient is the transpose of Divergence, so the roles of source and destination are swapped:

src: scalar pressure on the coarse grid (Grid4DDataScalar)
dst: vec3 velocity on the fine grid (Grid4DDataVec<..,3>)

Fused arithmetic derivation. The original element-matrix form is

    A[wedge](d·6+i, j) = -qw · |det J| · (J⁻ᵀ ∇N_i)_d · shape_coarse_j

so

    dst_{d,i} = Σⱼ A(d·6+i, j) · p_j
              = -qw · |det J| · (J⁻ᵀ ∇N_i)_d · Σⱼ shape_coarse_j · p_j
              = -qw · |det J| · (J⁻ᵀ ∇N_i)_d · p_interp,

where p_interp is the interpolated coarse pressure at the quadrature point. We never materialise A.
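The identity above can be checked numerically. The following is a minimal sketch, not the operator's actual code: `QuadPointData`, `apply_materialised`, `apply_fused`, `max_abs_diff` and `make_sample_point` are hypothetical names, and the sample values are arbitrary. It builds the 18x6 block A explicitly for one quadrature point and compares the result with the fused form.

```cpp
#include <algorithm>
#include <array>
#include <cmath>
#include <cstddef>

// Hypothetical container for the per-quadrature-point quantities in the derivation.
using Vec3 = std::array<double, 3>;

struct QuadPointData {
    double qw;                           // quadrature weight
    double abs_det_J;                    // |det J|
    std::array<Vec3, 6> grad_phys;       // J^{-T} grad N_i for the 6 fine nodes
    std::array<double, 6> shape_coarse;  // coarse shape functions at the point
};

// Reference path: materialise A (18 x 6) and multiply by the coarse pressures p.
std::array<double, 18> apply_materialised(const QuadPointData& q,
                                          const std::array<double, 6>& p) {
    std::array<std::array<double, 6>, 18> A{};
    for (std::size_t d = 0; d < 3; ++d)
        for (std::size_t i = 0; i < 6; ++i)
            for (std::size_t j = 0; j < 6; ++j)
                A[d * 6 + i][j] =
                    -q.qw * q.abs_det_J * q.grad_phys[i][d] * q.shape_coarse[j];

    std::array<double, 18> dst{};
    for (std::size_t r = 0; r < 18; ++r)
        for (std::size_t j = 0; j < 6; ++j)
            dst[r] += A[r][j] * p[j];
    return dst;
}

// Fused path: interpolate p once, fold it into a scalar prefactor, never build A.
std::array<double, 18> apply_fused(const QuadPointData& q,
                                   const std::array<double, 6>& p) {
    double p_interp = 0.0;
    for (std::size_t j = 0; j < 6; ++j) p_interp += q.shape_coarse[j] * p[j];

    const double prefactor = -q.qw * q.abs_det_J * p_interp;

    std::array<double, 18> dst{};
    for (std::size_t i = 0; i < 6; ++i)
        for (std::size_t d = 0; d < 3; ++d)
            dst[d * 6 + i] = prefactor * q.grad_phys[i][d];
    return dst;
}

// Largest entry-wise difference between two result vectors.
double max_abs_diff(const std::array<double, 18>& a,
                    const std::array<double, 18>& b) {
    double m = 0.0;
    for (std::size_t r = 0; r < 18; ++r) m = std::max(m, std::fabs(a[r] - b[r]));
    return m;
}

// Arbitrary sample values, chosen only so that every entry is distinct.
QuadPointData make_sample_point() {
    QuadPointData q{};
    q.qw = 0.5;
    q.abs_det_J = 2.0;
    for (std::size_t i = 0; i < 6; ++i) {
        q.shape_coarse[i] = 0.1 * static_cast<double>(i + 1);
        for (std::size_t d = 0; d < 3; ++d)
            q.grad_phys[i][d] =
                0.25 * static_cast<double>(i) - 0.5 * static_cast<double>(d);
    }
    return q;
}
```

The two paths differ only in the order of floating-point operations, so they should agree to rounding error; the fused path avoids storing the 108 entries of A.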

Per wedge:

Compute J, |det J|, J⁻¹.
Compute p_interp = Σⱼ shape_coarse_j(q) · p_j (6 muls).
prefactor = -qw · |det J| · p_interp.
For each fine node i ∈ 0..5:
    g = J⁻ᵀ · dN_ref_i (9 FMAs)
    contribution_d = prefactor · g_d (3 muls)
    Dirichlet: skip boundary-node scatter.
    Freeslip: project contribution onto tangent plane at boundary nodes.
    atomic_add into dst_(s, x+ddx, y+ddy, r+ddr, d) for d=0,1,2.
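The per-wedge steps can be sketched serially for a single quadrature point as follows. This is an illustration under stated assumptions, not the operator's implementation: `det3`, `inverse3` and `accumulate_wedge_gradient` are hypothetical names, J is assumed to be stored as J[r][c] = ∂x_r/∂ξ_c, the reference gradients are passed in rather than derived from the wedge basis, a plain `+=` stands in for the atomic_add scatter, and the freeslip branch is omitted.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

using Vec3 = std::array<double, 3>;
using Mat3 = std::array<std::array<double, 3>, 3>;

// Determinant of a 3x3 matrix by cofactor expansion along the first row.
double det3(const Mat3& J) {
    return J[0][0] * (J[1][1] * J[2][2] - J[1][2] * J[2][1])
         - J[0][1] * (J[1][0] * J[2][2] - J[1][2] * J[2][0])
         + J[0][2] * (J[1][0] * J[2][1] - J[1][1] * J[2][0]);
}

// Inverse of a 3x3 matrix via the adjugate; det must be non-zero.
Mat3 inverse3(const Mat3& J, double det) {
    Mat3 inv{};
    inv[0][0] = (J[1][1] * J[2][2] - J[1][2] * J[2][1]) / det;
    inv[0][1] = (J[0][2] * J[2][1] - J[0][1] * J[2][2]) / det;
    inv[0][2] = (J[0][1] * J[1][2] - J[0][2] * J[1][1]) / det;
    inv[1][0] = (J[1][2] * J[2][0] - J[1][0] * J[2][2]) / det;
    inv[1][1] = (J[0][0] * J[2][2] - J[0][2] * J[2][0]) / det;
    inv[1][2] = (J[0][2] * J[1][0] - J[0][0] * J[1][2]) / det;
    inv[2][0] = (J[1][0] * J[2][1] - J[1][1] * J[2][0]) / det;
    inv[2][1] = (J[0][1] * J[2][0] - J[0][0] * J[2][1]) / det;
    inv[2][2] = (J[0][0] * J[1][1] - J[0][1] * J[1][0]) / det;
    return inv;
}

// One quadrature point of one wedge. dst holds one Vec3 per fine node.
void accumulate_wedge_gradient(const Mat3& J, double qw,
                               const std::array<Vec3, 6>& dN_ref,
                               const std::array<double, 6>& shape_coarse,
                               const std::array<double, 6>& p,
                               const std::array<bool, 6>& is_dirichlet,
                               std::array<Vec3, 6>& dst) {
    // Step 1: J, |det J|, J^{-1}.
    const double det = det3(J);
    const Mat3 Jinv = inverse3(J, det);

    // Step 2: interpolate the coarse pressure at the quadrature point.
    double p_interp = 0.0;
    for (std::size_t j = 0; j < 6; ++j) p_interp += shape_coarse[j] * p[j];

    // Step 3: fold every scalar factor into one prefactor.
    const double prefactor = -qw * std::fabs(det) * p_interp;

    // Step 4: per fine node, push the reference gradient through J^{-T}.
    for (std::size_t i = 0; i < 6; ++i) {
        if (is_dirichlet[i]) continue;  // Dirichlet: skip the scatter
        for (std::size_t d = 0; d < 3; ++d) {
            // (J^{-T} v)_d = sum_k Jinv[k][d] * v[k]
            const double g_d = Jinv[0][d] * dN_ref[i][0]
                             + Jinv[1][d] * dN_ref[i][1]
                             + Jinv[2][d] * dN_ref[i][2];
            dst[i][d] += prefactor * g_d;  // stand-in for atomic_add
        }
    }
}
```

In the team-based kernel the scatter has to be atomic because neighbouring wedges share fine nodes; the serial `+=` here is only valid because nothing runs concurrently.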

The freeslip projection uses the fact that on a spherical shell the outward normal at a lateral node equals the unit-sphere coord already cached in coords_sh. This is the same trick as in DivergenceKerngen, applied to the output side here.
