Functional Interface

Full Reference Metrics

Peak Signal-to-Noise Ratio (PSNR)

piq.psnr(x: Tensor, y: Tensor, data_range: Union[int, float] = 1.0, reduction: str = 'mean', convert_to_greyscale: bool = False) Tensor

Compute Peak Signal-to-Noise Ratio for a batch of images. Supports both greyscale and colour images with RGB channel order.

Parameters:
  • x – An input tensor. Shape \((N, C, H, W)\).

  • y – A target tensor. Shape \((N, C, H, W)\).

  • data_range – Maximum value range of images (usually 1.0 or 255).

  • reduction – Specifies the reduction type: 'none' | 'mean' | 'sum'. Default: 'mean'

  • convert_to_greyscale – If True, convert RGB images to the YIQ colour space and compute PSNR only on the luminance channel. Otherwise compute it on all 3 channels.

Returns:

PSNR Index of similarity between two images.

References

https://en.wikipedia.org/wiki/Peak_signal-to-noise_ratio
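
Example

A minimal usage sketch with random tensors standing in for real images (shapes and values are illustrative; inputs are assumed to be scaled to [0, 1]):

>>> import torch
>>> import piq
>>> x = torch.rand(4, 3, 128, 128)  # predicted batch
>>> y = torch.rand(4, 3, 128, 128)  # target batch
>>> score = piq.psnr(x, y, data_range=1.0, reduction='mean')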

Structural Similarity (SSIM)

piq.ssim(x: Tensor, y: Tensor, kernel_size: int = 11, kernel_sigma: float = 1.5, data_range: Union[int, float] = 1.0, reduction: str = 'mean', full: bool = False, downsample: bool = True, k1: float = 0.01, k2: float = 0.03) List[Tensor]

Interface of Structural Similarity (SSIM) index. Inputs are supposed to be in range [0, data_range]. To match the behaviour of skimage and tensorflow, set 'downsample' = True.

Parameters:
  • x – An input tensor. Shape \((N, C, H, W)\) or \((N, C, H, W, 2)\).

  • y – A target tensor. Shape \((N, C, H, W)\) or \((N, C, H, W, 2)\).

  • kernel_size – The side-length of the sliding window used in comparison. Must be an odd value.

  • kernel_sigma – Sigma of normal distribution.

  • data_range – Maximum value range of images (usually 1.0 or 255).

  • reduction – Specifies the reduction type: 'none' | 'mean' | 'sum'. Default: 'mean'

  • full – Flag to also return the contrast sensitivity (cs) map.

  • downsample – Perform average pool before SSIM computation. Default: True

  • k1 – Algorithm parameter, K1 (small constant).

  • k2 – Algorithm parameter, K2 (small constant). Try a larger K2 constant (e.g. 0.4) if you get negative or NaN results.

Returns:

Value of Structural Similarity (SSIM) index. In case of 5D input tensors, a complex value is returned as a tensor of size 2.

References

Wang, Z., Bovik, A. C., Sheikh, H. R., & Simoncelli, E. P. (2004). Image quality assessment: From error visibility to structural similarity. IEEE Transactions on Image Processing, 13, 600-612. https://ece.uwaterloo.ca/~z70wang/publications/ssim.pdf, DOI: 10.1109/TIP.2003.819861
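
Example

A minimal sketch on random tensors (shapes are illustrative; inputs assumed in [0, 1]). With full=True the contrast sensitivity term is returned alongside the SSIM value:

>>> import torch
>>> import piq
>>> x, y = torch.rand(4, 3, 128, 128), torch.rand(4, 3, 128, 128)
>>> score = piq.ssim(x, y, data_range=1.0)
>>> per_image = piq.ssim(x, y, data_range=1.0, reduction='none')  # one value per image
>>> ssim_val, cs = piq.ssim(x, y, data_range=1.0, full=True)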

Multi-Scale Structural Similarity (MS-SSIM)

piq.multi_scale_ssim(x: Tensor, y: Tensor, kernel_size: int = 11, kernel_sigma: float = 1.5, data_range: Union[int, float] = 1.0, reduction: str = 'mean', scale_weights: Optional[Tensor] = None, k1: float = 0.01, k2: float = 0.03) Tensor

Interface of Multi-scale Structural Similarity (MS-SSIM) index. Inputs are supposed to be in range [0, data_range] with RGB channel order for colour images.

Parameters:
  • x – An input tensor. Shape \((N, C, H, W)\) or \((N, C, H, W, 2)\).

  • y – A target tensor. Shape \((N, C, H, W)\) or \((N, C, H, W, 2)\).

  • kernel_size – The side-length of the sliding window used in comparison. Must be an odd value.

  • kernel_sigma – Sigma of normal distribution.

  • data_range – Maximum value range of images (usually 1.0 or 255).

  • reduction – Specifies the reduction type: 'none' | 'mean' | 'sum'. Default: 'mean'

  • scale_weights – Weights for different scales. If None, default weights from the paper will be used. Default weights: (0.0448, 0.2856, 0.3001, 0.2363, 0.1333).

  • k1 – Algorithm parameter, K1 (small constant).

  • k2 – Algorithm parameter, K2 (small constant). Try a larger K2 constant (e.g. 0.4) if you get negative or NaN results.

Returns:

Value of Multi-scale Structural Similarity (MS-SSIM) index. In case of 5D input tensors, a complex value is returned as a tensor of size 2.

References

Wang, Z., Simoncelli, E. P., Bovik, A. C. (2003). Multi-scale Structural Similarity for Image Quality Assessment. IEEE Asilomar Conference on Signals, Systems and Computers, 37. https://ieeexplore.ieee.org/document/1292216, DOI: 10.1109/ACSSC.2003.1292216

Wang, Z., Bovik, A. C., Sheikh, H. R., & Simoncelli, E. P. (2004). Image quality assessment: From error visibility to structural similarity. IEEE Transactions on Image Processing, 13, 600-612. https://ece.uwaterloo.ca/~z70wang/publications/ssim.pdf, DOI: 10.1109/TIP.2003.819861

Note

The size of the image should be at least (kernel_size - 1) * 2 ** (levels - 1) + 1; with the default kernel_size = 11 and 5 scales this is 161 pixels per side.
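
Example

A minimal sketch (shapes are illustrative; 256-pixel sides satisfy the minimum-size requirement above). Passing scale_weights is optional; the tensor below simply restates the paper defaults:

>>> import torch
>>> import piq
>>> x, y = torch.rand(4, 3, 256, 256), torch.rand(4, 3, 256, 256)
>>> score = piq.multi_scale_ssim(x, y, data_range=1.0)
>>> weights = torch.tensor([0.0448, 0.2856, 0.3001, 0.2363, 0.1333])
>>> score = piq.multi_scale_ssim(x, y, data_range=1.0, scale_weights=weights)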

Information Content Weighted Structural Similarity (IW-SSIM)

piq.information_weighted_ssim(x: Tensor, y: Tensor, data_range: Union[int, float] = 1.0, kernel_size: int = 11, kernel_sigma: float = 1.5, k1: float = 0.01, k2: float = 0.03, parent: bool = True, blk_size: int = 3, sigma_nsq: float = 0.4, scale_weights: Optional[Tensor] = None, reduction: str = 'mean') Tensor

Interface of Information Content Weighted Structural Similarity (IW-SSIM) index. Inputs are supposed to be in range [0, data_range].

Parameters:
  • x – An input tensor. Shape \((N, C, H, W)\).

  • y – A target tensor. Shape \((N, C, H, W)\).

  • data_range – Maximum value range of images (usually 1.0 or 255).

  • kernel_size – The side-length of the sliding window used in comparison. Must be an odd value.

  • kernel_sigma – Sigma of normal distribution for sliding window used in comparison.

  • k1 – Algorithm parameter, K1 (small constant).

  • k2 – Algorithm parameter, K2 (small constant). Try a larger K2 constant (e.g. 0.4) if you get negative or NaN results.

  • parent – Flag to control dependency on previous layer of pyramid.

  • blk_size – The side-length of the sliding window used in comparison for information content.

  • sigma_nsq – Parameter of visual distortion model.

  • scale_weights – Weights for different scales. If None, default weights are used.

  • reduction – Specifies the reduction type: 'none' | 'mean' | 'sum'. Default: 'mean'

Returns:

Value of Information Content Weighted Structural Similarity (IW-SSIM) index.

References

Wang, Zhou, and Qiang Li. Information content weighting for perceptual image quality assessment. IEEE Transactions on Image Processing, 20(5) (2011): 1185-1198. https://ece.uwaterloo.ca/~z70wang/publications/IWSSIM.pdf, DOI: 10.1109/TIP.2010.2092435

Note

Lack of content in the target image could lead to a RuntimeError due to a singular information content matrix, which cannot be inverted.
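
Example

A minimal sketch (shapes are illustrative). Random noise carries enough information content to avoid the singular-matrix issue noted above; flat synthetic targets may not:

>>> import torch
>>> import piq
>>> x, y = torch.rand(4, 3, 256, 256), torch.rand(4, 3, 256, 256)
>>> score = piq.information_weighted_ssim(x, y, data_range=1.0)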

Visual Information Fidelity (VIFp)

piq.vif_p(x: Tensor, y: Tensor, sigma_n_sq: float = 2.0, data_range: Union[int, float] = 1.0, reduction: str = 'mean') Tensor

Compute Visual Information Fidelity in the pixel domain for a batch of images. This metric is not symmetric, so make sure to place the arguments in the correct order. Both inputs are supposed to have RGB channel order.

Parameters:
  • x – An input tensor. Shape \((N, C, H, W)\).

  • y – A target tensor. Shape \((N, C, H, W)\).

  • sigma_n_sq – HVS model parameter (variance of the visual noise).

  • data_range – Maximum value range of images (usually 1.0 or 255).

  • reduction – Specifies the reduction type: 'none' | 'mean' | 'sum'. Default: 'mean'

Returns:

VIF Index of similarity between two images. Usually in [0, 1] interval. Can be bigger than 1 for predicted \(x\) images with higher contrast than the original ones.

References

H. R. Sheikh and A. C. Bovik, “Image information and visual quality,” IEEE Transactions on Image Processing, vol. 15, no. 2, pp. 430-444, Feb. 2006 https://ieeexplore.ieee.org/abstract/document/1576816/ DOI: 10.1109/TIP.2005.859378.

Note

In the original paper this method was used for bands of a discrete wavelet decomposition. Later the authors released code to compute a VIF approximation in the pixel domain. See https://live.ece.utexas.edu/research/Quality/VIF.htm for details.
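
Example

A minimal sketch (shapes are illustrative). Because the metric is not symmetric, the predicted image goes first and the reference second:

>>> import torch
>>> import piq
>>> x = torch.rand(4, 3, 256, 256)  # predicted
>>> y = torch.rand(4, 3, 256, 256)  # reference
>>> score = piq.vif_p(x, y, data_range=1.0)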

Feature Similarity Index Measure (FSIM)

piq.fsim(x: Tensor, y: Tensor, reduction: str = 'mean', data_range: Union[int, float] = 1.0, chromatic: bool = True, scales: int = 4, orientations: int = 4, min_length: int = 6, mult: int = 2, sigma_f: float = 0.55, delta_theta: float = 1.2, k: float = 2.0) Tensor

Compute Feature Similarity Index Measure for a batch of images.

Parameters:
  • x – An input tensor. Shape \((N, C, H, W)\).

  • y – A target tensor. Shape \((N, C, H, W)\).

  • reduction – Specifies the reduction type: 'none' | 'mean' | 'sum'. Default: 'mean'

  • data_range – Maximum value range of images (usually 1.0 or 255).

  • chromatic – Flag to compute FSIMc, which also takes into account chromatic components.

  • scales – Number of wavelets used for computation of phase congruency maps.

  • orientations – Number of filter orientations used for computation of phase congruency maps.

  • min_length – Wavelength of the smallest scale filter.

  • mult – Scaling factor between successive filters.

  • sigma_f – Ratio of the standard deviation of the Gaussian describing the log Gabor filter’s transfer function in the frequency domain to the filter center frequency.

  • delta_theta – Ratio of angular interval between filter orientations and the standard deviation of the angular Gaussian function used to construct filters in the frequency plane.

  • k – Number of standard deviations of the noise energy beyond the mean at which the noise threshold point is set; phase congruency values below it are penalized.

Returns:

Index of similarity between two images. Usually in [0, 1] interval. Can be bigger than 1 for predicted \(x\) images with higher contrast than the original ones.

References

L. Zhang, L. Zhang, X. Mou and D. Zhang, “FSIM: A Feature Similarity Index for Image Quality Assessment,” IEEE Transactions on Image Processing, vol. 20, no. 8, pp. 2378-2386, Aug. 2011, doi: 10.1109/TIP.2011.2109730. https://ieeexplore.ieee.org/document/5705575

Note

This implementation is based on the original MATLAB code. https://www4.comp.polyu.edu.hk/~cslzhang/IQA/FSIM/FSIM.htm
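
Example

A minimal sketch (shapes are illustrative). chromatic=True computes the FSIMc variant; chromatic=False computes FSIM on luminance only:

>>> import torch
>>> import piq
>>> x, y = torch.rand(4, 3, 256, 256), torch.rand(4, 3, 256, 256)
>>> fsimc = piq.fsim(x, y, data_range=1.0, chromatic=True)
>>> fsim_val = piq.fsim(x, y, data_range=1.0, chromatic=False)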

Spectral Residual based Similarity (SR-SIM)

piq.srsim(x: Tensor, y: Tensor, reduction: str = 'mean', data_range: Union[int, float] = 1.0, chromatic: bool = False, scale: float = 0.25, kernel_size: int = 3, sigma: float = 3.8, gaussian_size: int = 10) Tensor

Compute Spectral Residual based Similarity for a batch of images.

Parameters:
  • x – Predicted images. Shape \((H, W)\), \((C, H, W)\) or \((N, C, H, W)\).

  • y – Target images. Shape \((H, W)\), \((C, H, W)\) or \((N, C, H, W)\).

  • reduction – Specifies the reduction type: 'none' | 'mean' | 'sum'. Default: 'mean'

  • data_range – Value range of input images (usually 1.0 or 255). Default: 1.0

  • chromatic – Flag to compute SR-SIMc, which also takes into account chromatic components

  • scale – Resizing factor used in saliency map computation

  • kernel_size – Kernel size of average blur filter used in saliency map computation

  • sigma – Sigma of gaussian filter applied on saliency map

  • gaussian_size – Size of gaussian filter applied on saliency map

Returns:

Index of similarity between two images. Usually in [0, 1] interval. Can be bigger than 1 for predicted images with higher contrast than the original ones.

Return type:

SR-SIM

Note

This implementation is based on the original MATLAB code. https://sse.tongji.edu.cn/linzhang/IQA/SR-SIM/Files/SR_SIM.m
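
Example

A minimal sketch on a 4D batch (shapes are illustrative; 3D and 2D inputs are accepted as well, per the shapes listed above):

>>> import torch
>>> import piq
>>> x, y = torch.rand(4, 3, 256, 256), torch.rand(4, 3, 256, 256)
>>> score = piq.srsim(x, y, data_range=1.0, chromatic=False)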

Gradient Magnitude Similarity Deviation (GMSD)

piq.gmsd(x: Tensor, y: Tensor, reduction: str = 'mean', data_range: Union[int, float] = 1.0, t: float = 0.00261437908496732) Tensor

Compute Gradient Magnitude Similarity Deviation.

Supports greyscale and colour images with RGB channel order.

Parameters:
  • x – An input tensor. Shape \((N, C, H, W)\).

  • y – A target tensor. Shape \((N, C, H, W)\).

  • reduction – Specifies the reduction type: 'none' | 'mean' | 'sum'. Default: 'mean'

  • data_range – Maximum value range of images (usually 1.0 or 255).

  • t – Constant from the reference paper for numerical stability of the similarity map.

Returns:

Gradient Magnitude Similarity Deviation between given tensors.

References

Wufeng Xue et al. Gradient Magnitude Similarity Deviation (2013) https://arxiv.org/pdf/1308.3052.pdf
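
Example

A minimal sketch (shapes are illustrative). GMSD is a deviation measure, so lower values indicate more similar images:

>>> import torch
>>> import piq
>>> x, y = torch.rand(4, 3, 256, 256), torch.rand(4, 3, 256, 256)
>>> score = piq.gmsd(x, y, data_range=1.0)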

Multi-Scale Gradient Magnitude Similarity Deviation (MS-GMSD)

piq.multi_scale_gmsd(x: Tensor, y: Tensor, data_range: Union[int, float] = 1.0, reduction: str = 'mean', scale_weights: Optional[Tensor] = None, chromatic: bool = False, alpha: float = 0.5, beta1: float = 0.01, beta2: float = 0.32, beta3: float = 15.0, t: float = 170) Tensor

Compute Multi-Scale Gradient Magnitude Similarity Deviation (MS-GMSD).

Supports greyscale and colour images with RGB channel order. The height and width should be at least 2 ** scales + 1.

Parameters:
  • x – An input tensor. Shape \((N, C, H, W)\).

  • y – A target tensor. Shape \((N, C, H, W)\).

  • data_range – Maximum value range of images (usually 1.0 or 255).

  • reduction – Specifies the reduction type: 'none' | 'mean' | 'sum'. Default: 'mean'

  • scale_weights – Weights for different scales. Can contain any number of floating point values.

  • chromatic – Flag to use the MS-GMSDc algorithm from the paper, which also evaluates chromatic components of the image. Default: False

  • alpha – Masking coefficient. See references for details.

  • beta1 – Algorithm parameter. Weight of chromatic component in the loss.

  • beta2 – Algorithm parameter. Small constant, see references.

  • beta3 – Algorithm parameter. Small constant, see references.

  • t – Constant from the reference paper for numerical stability of the similarity map.

Returns:

Value of MS-GMSD in [0, 1] range.

References

Bo Zhang et al. Gradient Magnitude Similarity Deviation on Multiple Scales (2017). http://www.cse.ust.hk/~psander/docs/gradsim.pdf
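
Example

A minimal sketch (shapes are illustrative; 256-pixel sides comfortably satisfy the 2 ** scales + 1 minimum noted above):

>>> import torch
>>> import piq
>>> x, y = torch.rand(4, 3, 256, 256), torch.rand(4, 3, 256, 256)
>>> score = piq.multi_scale_gmsd(x, y, data_range=1.0, chromatic=False)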

Visual Saliency-induced Index (VSI)

piq.vsi(x: Tensor, y: Tensor, reduction: str = 'mean', data_range: Union[int, float] = 1.0, c1: float = 1.27, c2: float = 386.0, c3: float = 130.0, alpha: float = 0.4, beta: float = 0.02, omega_0: float = 0.021, sigma_f: float = 1.34, sigma_d: float = 145.0, sigma_c: float = 0.001) Tensor

Compute Visual Saliency-induced Index for a batch of images.

Both inputs are supposed to have RGB channel order in accordance with the original approach. Nevertheless, the method supports greyscale images, which are converted to RGB by copying the grey channel 3 times.

Parameters:
  • x – An input tensor. Shape \((N, C, H, W)\).

  • y – A target tensor. Shape \((N, C, H, W)\).

  • reduction – Specifies the reduction type: 'none' | 'mean' | 'sum'. Default: 'mean'

  • data_range – Maximum value range of images (usually 1.0 or 255).

  • c1 – coefficient to calculate saliency component of VSI

  • c2 – coefficient to calculate gradient component of VSI

  • c3 – coefficient to calculate color component of VSI

  • alpha – power for gradient component of VSI

  • beta – power for color component of VSI

  • omega_0 – coefficient to get log Gabor filter at SDSP

  • sigma_f – coefficient to get log Gabor filter at SDSP

  • sigma_d – coefficient to get SDSP

  • sigma_c – coefficient to get SDSP

Returns:

Index of similarity between two images. Usually in [0, 1] range.

References

L. Zhang, Y. Shen and H. Li, “VSI: A Visual Saliency-Induced Index for Perceptual Image Quality Assessment,” IEEE Transactions on Image Processing, vol. 23, no. 10, pp. 4270-4281, Oct. 2014, doi: 10.1109/TIP.2014.2346028 https://ieeexplore.ieee.org/document/6873260

Note

The original method supports only RGB images. See https://ieeexplore.ieee.org/document/6873260 for details.
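
Example

A minimal sketch (shapes are illustrative). Greyscale input is also accepted and is replicated to 3 channels internally, as noted above:

>>> import torch
>>> import piq
>>> x, y = torch.rand(4, 3, 256, 256), torch.rand(4, 3, 256, 256)
>>> score = piq.vsi(x, y, data_range=1.0)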

DCT Subband Similarity (DSS)

piq.dss(x: Tensor, y: Tensor, reduction: str = 'mean', data_range: Union[int, float] = 1.0, dct_size: int = 8, sigma_weight: float = 1.55, kernel_size: int = 3, sigma_similarity: float = 1.5, percentile: float = 0.05) Tensor

Compute DCT Subband Similarity index for a batch of images.

Parameters:
  • x – An input tensor. Shape \((N, C, H, W)\).

  • y – A target tensor. Shape \((N, C, H, W)\).

  • reduction – Specifies the reduction type: 'none' | 'mean' | 'sum'. Default: 'mean'

  • data_range – Maximum value range of images (usually 1.0 or 255).

  • dct_size – Size of blocks in 2D Discrete Cosine Transform. DCT sizes must be in (0, input size].

  • sigma_weight – STD of the gaussian that determines the proportion of weight given to low and high frequencies. Default: 1.55

  • kernel_size – Size of gaussian kernel for computing subband similarity. Kernels size must be in (0, input size]. Default: 3

  • sigma_similarity – STD of gaussian kernel for computing subband similarity. Default: 1.5

  • percentile – Fraction in (0, 1] of the worst similarity scores which should be kept. Default: 0.05

Returns:

Index of similarity between two images. In [0, 1] interval.

Return type:

DSS

Note

This implementation is based on the original MATLAB code (see header). Image will be scaled to [0, 255] because all constants are computed for this range. Make sure you know what you are doing when changing default coefficient values.
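
Example

A minimal sketch with default coefficients (shapes are illustrative; inputs in [0, 1] are rescaled to [0, 255] internally, as noted above):

>>> import torch
>>> import piq
>>> x, y = torch.rand(4, 3, 256, 256), torch.rand(4, 3, 256, 256)
>>> score = piq.dss(x, y, data_range=1.0)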

Haar Perceptual Similarity Index (HaarPSI)

piq.haarpsi(x: Tensor, y: Tensor, reduction: str = 'mean', data_range: Union[int, float] = 1.0, scales: int = 3, subsample: bool = True, c: float = 30.0, alpha: float = 4.2) Tensor

Compute the Haar Wavelet-Based Perceptual Similarity Index. Inputs are supposed to be in range [0, data_range] with RGB channel order for colour images.

Parameters:
  • x – An input tensor. Shape \((N, C, H, W)\).

  • y – A target tensor. Shape \((N, C, H, W)\).

  • reduction – Specifies the reduction type: 'none' | 'mean' | 'sum'. Default: 'mean'

  • data_range – Maximum value range of images (usually 1.0 or 255).

  • scales – Number of Haar wavelets used for image decomposition.

  • subsample – Flag to apply average pooling before HaarPSI computation. See references for details.

  • c – Constant from the paper. See references for details.

  • alpha – Exponent used for similarity maps weighting. See references for details.

Returns:

HaarPSI Wavelet-Based Perceptual Similarity between two tensors.

References

R. Reisenhofer, S. Bosse, G. Kutyniok & T. Wiegand (2017) ‘A Haar Wavelet-Based Perceptual Similarity Index for Image Quality Assessment’ http://www.math.uni-bremen.de/cda/HaarPSI/publications/HaarPSI_preprint_v4.pdf

Code from authors on MATLAB and Python https://github.com/rgcda/haarpsi
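
Example

A minimal sketch (shapes are illustrative):

>>> import torch
>>> import piq
>>> x, y = torch.rand(4, 3, 256, 256), torch.rand(4, 3, 256, 256)
>>> score = piq.haarpsi(x, y, data_range=1.0)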

Mean Deviation Similarity Index (MDSI)

piq.mdsi(x: Tensor, y: Tensor, data_range: Union[int, float] = 1.0, reduction: str = 'mean', c1: float = 140.0, c2: float = 55.0, c3: float = 550.0, combination: str = 'sum', alpha: float = 0.6, beta: float = 0.1, gamma: float = 0.2, rho: float = 1.0, q: float = 0.25, o: float = 0.25)

Compute Mean Deviation Similarity Index (MDSI) for a batch of images. Supports greyscale and colour images with RGB channel order.

Parameters:
  • x – An input tensor. Shape \((N, C, H, W)\).

  • y – A target tensor. Shape \((N, C, H, W)\).

  • data_range – Maximum value range of images (usually 1.0 or 255).

  • reduction – Specifies the reduction type: 'none' | 'mean' | 'sum'. Default: 'mean'

  • c1 – coefficient to calculate gradient similarity. Default: 140.

  • c2 – coefficient to calculate gradient similarity. Default: 55.

  • c3 – coefficient to calculate chromaticity similarity. Default: 550.

  • combination – mode to combine gradient similarity and chromaticity similarity: 'sum' | 'mult'.

  • alpha – coefficient to combine gradient similarity and chromaticity similarity using summation.

  • beta – power to combine gradient similarity with chromaticity similarity using multiplication.

  • gamma – power to combine gradient similarity and chromaticity similarity using multiplication.

  • rho – order of the Minkowski distance

  • q – coefficient that adjusts the emphasis of the values in image and MCT

  • o – the power pooling applied on the final value of the deviation

Returns:

Mean Deviation Similarity Index (MDSI) between 2 tensors.

References

Nafchi, Hossein Ziaei and Shahkolaei, Atena and Hedjam, Rachid and Cheriet, Mohamed (2016). Mean deviation similarity index: Efficient and reliable full-reference image quality evaluator. IEEE Access, 4, 5579–5590. https://arxiv.org/pdf/1608.07433.pdf, DOI: 10.1109/ACCESS.2016.2604042

Note

The ratio between the constants is usually set to \(c_3 = 4c_1 = 10c_2\).

Note

Both inputs are supposed to have RGB channels order in accordance with the original approach. Nevertheless, the method supports greyscale images, which are converted to RGB by copying the grey channel 3 times.
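
Example

A minimal sketch (shapes are illustrative). combination selects how the gradient and chromaticity similarities are merged:

>>> import torch
>>> import piq
>>> x, y = torch.rand(4, 3, 256, 256), torch.rand(4, 3, 256, 256)
>>> score = piq.mdsi(x, y, data_range=1.0, combination='sum')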

No Reference Metrics

Total Variation

piq.total_variation(x: Tensor, reduction: str = 'mean', norm_type: str = 'l2') Tensor

Compute the Total Variation metric.

Parameters:
  • x – Tensor. Shape \((N, C, H, W)\).

  • reduction – Specifies the reduction type: 'none' | 'mean' | 'sum'. Default: 'mean'

  • norm_type – 'l1' | 'l2' | 'l2_squared'; defines which type of norm to implement, isotropic or anisotropic.

Returns:

Total variation of a given tensor

References

https://www.wikiwand.com/en/Total_variation_denoising

https://remi.flamary.com/demos/proxtv.html
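
Example

A minimal sketch (shape is illustrative). Only one tensor is needed, since Total Variation is a no-reference metric:

>>> import torch
>>> import piq
>>> x = torch.rand(4, 3, 256, 256)
>>> tv = piq.total_variation(x, norm_type='l2')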

Blind/Referenceless Image Spatial Quality Evaluator (BRISQUE)

piq.brisque(x: Tensor, kernel_size: int = 7, kernel_sigma: float = 1.1666666666666667, data_range: Union[int, float] = 1.0, reduction: str = 'mean') Tensor

Interface of BRISQUE index. Supports greyscale and colour images with RGB channel order.

Parameters:
  • x – An input tensor. Shape \((N, C, H, W)\).

  • kernel_size – The side-length of the sliding window used in comparison. Must be an odd value.

  • kernel_sigma – Sigma of normal distribution.

  • data_range – Maximum value range of images (usually 1.0 or 255).

  • reduction – Specifies the reduction type: 'none' | 'mean' | 'sum'. Default: 'mean'

Returns:

Value of BRISQUE index.

References

Anish Mittal et al. “No-Reference Image Quality Assessment in the Spatial Domain”, https://live.ece.utexas.edu/publications/2012/TIP%20BRISQUE.pdf

Note

Backpropagation is not available with torch==1.5.0 due to a bug in argmin and argmax backpropagation. Update torch and torchvision to the latest versions.
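
Example

A minimal sketch (shape is illustrative; real photographs yield more meaningful scores than random noise):

>>> import torch
>>> import piq
>>> x = torch.rand(4, 3, 256, 256)
>>> score = piq.brisque(x, data_range=1.0)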

Feature Metrics

Inception Score (IS)

piq.inception_score(features: Tensor, num_splits: int = 10)

Compute Inception Score for a list of image features. Expects raw logits from Inception-V3 as input.

Parameters:
  • features (torch.Tensor) – Low-dimension representation of image set. Shape (N_samples, encoder_dim).

  • num_splits – Number of parts to divide features into. Inception Score is computed for each part separately and the results are then averaged.

Returns:

score – Mean Inception Score over the splits.

variance – Variance of the Inception Score over the splits.

References

“A Note on the Inception Score” https://arxiv.org/pdf/1801.01973.pdf
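
Example

A minimal sketch with random logits standing in for real Inception-V3 outputs (the sample count and the 1000-dimensional feature size are illustrative):

>>> import torch
>>> import piq
>>> features = torch.rand(50, 1000)  # raw logits for 50 samples
>>> mean, variance = piq.inception_score(features, num_splits=10)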